Preface


This is the first part of an intensive 2-year course of algebra for students beginning 
a professional study of higher mathematics. This textbook is based on courses given 
at the Independent University of Moscow and at the Faculty of Mathematics in the 
National Research University Higher School of Economics. In particular, it contains 
a large number of exercises that were discussed in class, some of which are provided 
with commentary and hints, as well as problems for independent solution, which 
were assigned as homework. Working out the exercises is of crucial importance in 
understanding the subject matter of this book. 


Moscow, Russia Alexey L. Gorodentsev 
Notation and Abbreviations


$\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}, \mathbb{H}$    Positive integers, integers, rational numbers, real numbers, complex numbers, quaternions

$\Rightarrow$ and $\iff$    "implies" and "if and only if"

$\forall$, $\exists$, $:$    "for all," "there exists," "such that"

$\operatorname{Hom}(X, Y)$    Set of maps or homomorphisms $X \to Y$

$\operatorname{End}(X) = \operatorname{Hom}(X, X)$    Set of maps or endomorphisms $X \to X$

$\operatorname{Aut}(X) \subset \operatorname{End}(X)$    Group of invertible maps or automorphisms $X \to X$

$|M|$, $|G|$, $|\lambda|$    Cardinality of finite set $M$ or group $G$, total number of cells in Young diagram $\lambda$

$|pq|$, $|v|$, $\|v\|$    Distance between points $p$, $q$ and length (or norm) of vector $v$

$a \,\vdots\, b$ (or $b \mid a$)    $a$ is divisible by $b$ (or $b$ divides $a$)

$a \equiv b \pmod n$    $a$ is congruent to $b$ modulo $n$ (i.e., $(a - b) \,\vdots\, n$)

$\mathbb{Z}/(n)$, $\mathbb{F}_q$    Ring or additive group of integers modulo $n$, finite field of $q$ elements

GCD, LCM, poset    Greatest common divisor, least common multiple, partially ordered set

$S_n$    The symmetric group $\operatorname{Aut}\{1, 2, \ldots, n\}$

$(\sigma_1, \sigma_2, \ldots, \sigma_n) \in S_n$    Permutation $k \mapsto \sigma_k$

$|i_1, i_2, \ldots, i_m\rangle \in S_n$    Cyclic permutation $i_1 \mapsto i_2 \mapsto \cdots \mapsto i_m \mapsto i_1$

$K[x]$ and $K[[x]]$    Rings of polynomials and formal power series with coefficients in commutative ring $K$

$k[x_1, x_2, \ldots, x_n]_{\le m}$    Vector space of polynomials of degree at most $m$ in variables $x_1, x_2, \ldots, x_n$ with coefficients in field $k$

$k\langle \xi_1, \xi_2, \ldots, \xi_n \rangle$    Ring of Grassmannian polynomials in (skew commuting) variables $\xi_1, \xi_2, \ldots, \xi_n$

$F^*$, $K^*$    Multiplicative groups of nonzero elements in field $F$ and of invertible elements in ring $K$

$V^*$, $F^*$    Dual vector space for vector space $V$ and dual linear map for linear map $F$

${}^{\vee}F$, $F^{\vee}$, $F^{\dagger}$    Left and right adjoint operators for $F$ with respect to nondegenerate bilinear form, and Hermitian adjoint operator for $F$

$\operatorname{Mat}_{m \times n}(K)$, $\operatorname{Mat}_n(K)$    Module of matrices having $m$ rows and $n$ columns, and $K$-algebra of square $n \times n$ matrices with elements in ring $K$

$M^t$, $\lambda^t$    Transposed matrix or Young diagram

$\langle \xi, v \rangle = \xi(v) = \operatorname{ev}_v(\xi)$    Contraction between vector $v \in V$ and covector $\xi \in V^*$

$(v, w)$    Euclidean or Hermitian inner product of vectors $v$, $w$

$\mathbb{A}(V)$, $\mathbb{P}(V)$    Affine and projective spaces associated with a vector space $V$

$Z(f) \subset \mathbb{P}(V)$    Hypersurface defined by equation $f(v) = 0$

$\mathrm{GL}(V)$, $\mathrm{O}(V)$, $\mathrm{U}(V)$, $\mathrm{PGL}(V)$    Groups of linear, orthogonal, unitary transformations of a vector space $V$, and projective transformations of its projectivization $\mathbb{P}(V)$

$\mathrm{SL}(V)$, $\mathrm{SO}(V)$, $\mathrm{SU}(V)$    Groups of linear, orthogonal, and unitary transformations of determinant 1

$\mathrm{GL}_n(k)$, $\mathrm{PGL}_n(k)$, $\mathrm{SL}_n(k)$, etc.    Groups of $n \times n$ matrices obtained from the previous groups for $V = k^n$

$S^n V^*$    Vector space of homogeneous degree-$n$ polynomials on vector space $V$

$Q$, $q$, $\tilde{q}$, $\hat{q}$    Quadric $Q = Z(q) \subset \mathbb{P}(V)$ defined by equation $q(v) = 0$, where $q \in S^2 V^*$ has polarization $\tilde{q} : V \times V \to \mathbb{F}$ and correlation map $\hat{q} : V \to V^*$

Chapter 1 
Set-Theoretic and Combinatorial Background 


1.1 Sets and Maps 


1.1.1 Sets


I have no desire to include a rigorous introduction to the theory of sets in this 
book. Perhaps what follows will motivate the interested reader to learn this theory 
in a special course on mathematical logic. In any case, the common intuitive 
understanding of a set as an abstract “aggregate of elements” is enough for our 
purposes. Any set can be imagined geometrically as a collection of points, and we 
will often refer to the elements of a set as points. By definition, all the elements of 
a set are distinct. A set X may be considered as having been adequately defined as 
soon as one can say that a given item is or is not an element of X. If x is an element 
of a set X, we write x ∈ X. Two sets are equal if they consist of the same elements.
There is a unique set containing no elements. It is called the empty set and is denoted
by ∅. For a finite set X, we write |X| for the total number of elements in X and call
it the cardinality of X. A set X is called a subset of a set Y if each element x ∈ X
also belongs to Y. In this case, we write X ⊂ Y. Note that ∅ is a subset of every set,
and every set is a subset of itself. A subset of a set X that is not equal to X is said to 
be proper. 


Exercise 1.1 How many subsets (including the set itself) are there in a finite set of 
cardinality n? 


Given two sets X and Y, the union X ∪ Y consists of all elements belonging to at
least one of them. The union of nonintersecting sets Y, Z is denoted by Y ⊔ Z and
called their disjoint union. The intersection X ∩ Y consists of all elements belonging
to both sets X, Y simultaneously. The set difference X ∖ Y consists of all elements
that belong to X but not to Y. The direct product¹ X × Y consists of all ordered pairs
(x, y), where x ∈ X, y ∈ Y.
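For finite sets, the four operations just described can be tried out directly. The following is only an illustrative sketch using Python's built-in `set` type; the particular sets and the variable names are assumptions made for the example and do not come from the text.

```python
from itertools import product

# Two small finite sets chosen only for illustration.
X = {1, 2, 3, 4}
Y = {3, 4, 5}

union = X | Y                          # X ∪ Y: elements lying in at least one set
intersection = X & Y                   # X ∩ Y: elements lying in both sets
difference = X - Y                     # X ∖ Y: elements of X that are not in Y
direct_product = set(product(X, Y))    # X × Y: all ordered pairs (x, y)

print(sorted(union), sorted(intersection), sorted(difference))
print(len(direct_product))             # equals |X| * |Y|
```

Note that the direct product is stored as a set of tuples, so its cardinality is the product of the cardinalities of the factors.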


Exercise 1.2 Check that the intersection can be expressed in terms of the difference 
as X ∩ Y = X ∖ (X ∖ Y). Is it possible to express the difference in terms of the
intersection and union? 


1.1.2 Maps


A map (or function) f : X → Y from a set X to a set Y is an assignment x ↦ f(x)
that relates each point x ∈ X with some point y = f(x) ∈ Y called the image of x
under f or the value of f at x. Note that y must be uniquely determined by x and f.
Two maps f : X → Y and g : X → Y are said to be equal if f(x) = g(x) for all
x ∈ X. We write Hom(X, Y) for the set of all maps X → Y.

All points x ∈ X sent by the map f : X → Y to a given point y ∈ Y form a subset
of X denoted by

$$ f^{-1}(y) \stackrel{\text{def}}{=} \{x \in X \mid f(x) = y\} $$

and called the preimage of y under f or the fiber of f over y. The preimages of
distinct points are disjoint and may consist of arbitrarily many points or even be
empty. The points y ∈ Y with a nonempty preimage form a subset of Y called the
image of f and denoted by

$$ \operatorname{im}(f) = \{y \in Y \mid f^{-1}(y) \neq \varnothing\} = \{y \in Y \mid \exists\, x \in X : f(x) = y\}. $$


A map f : X → Y is called surjective (or an epimorphism) if the preimage of every
point y ∈ Y is nonempty, i.e., if im(f) = Y. We designate a surjective map by a
two-headed arrow X ↠ Y. A map f is called injective (or a monomorphism) if the
preimage of every point y ∈ Y contains at most one element, i.e., $f(x_1) \neq f(x_2)$ for
all $x_1 \neq x_2$. Injective maps are designated by a hooked arrow X ↪ Y.
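For maps between finite sets, fibers, images, injectivity, and surjectivity can all be computed by inspection. The sketch below assumes that a map is stored as a Python dictionary from points of X to points of Y; the helper names `fiber`, `image`, `is_injective`, and `is_surjective` are hypothetical and were chosen only for this illustration.

```python
def fiber(f, X, y):
    """Preimage of y: all points x of X with f(x) == y."""
    return {x for x in X if f[x] == y}

def image(f, X):
    """Set of all values taken by f on X."""
    return {f[x] for x in X}

def is_surjective(f, X, Y):
    """Every point of Y has a nonempty preimage."""
    return image(f, X) == set(Y)

def is_injective(f, X):
    """Distinct points have distinct images, so |im f| == |X|."""
    return len(image(f, X)) == len(set(X))

# A sample map X -> Y stored as a dictionary (an assumption of this sketch).
X = {1, 2, 3}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "c"}

print(fiber(f, X, "a"), fiber(f, X, "b"), image(f, X))
print(is_injective(f, X), is_surjective(f, X, Y))
```

In this sample the fiber over "a" has two points and the fiber over "b" is empty, so the map is neither injective nor surjective.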


Exercise 1.3 List all maps {0, 1, 2} → {0, 1} and all maps {0, 1} → {0, 1, 2}.
How many epimorphisms and monomorphisms are there among them in each case? 


A map f : X → Y is called bijective or an isomorphism if it is simultaneously
surjective and injective. This means that for every y ∈ Y, there exists a unique
x ∈ X such that f(x) = y. For this reason, a bijection is also called a one-to-one
correspondence between X and Y. We designate a bijection by an arrow with a tilde
over it: $X \xrightarrow{\sim} Y$.


¹ Also called the Cartesian product of sets.


Exercise 1.4 Indicate all bijections, injections, and surjections among the following
maps: (a) ℕ → ℕ, x ↦ x², (b) ℤ → ℤ, x ↦ x², (c) ℤ → ℤ, x ↦ 7x, (d) ℚ → ℚ,
x ↦ 7x.


A map from X to itself is called an endomorphism of X. We write
$\operatorname{End}(X) \stackrel{\text{def}}{=} \operatorname{Hom}(X, X)$ for the set of all endomorphisms of X. Bijective
endomorphisms $X \xrightarrow{\sim} X$ are called automorphisms of X. We denote the set of
all automorphisms by Aut(X). One can think of an automorphism $X \xrightarrow{\sim} X$ as a
permutation of the elements of X. The trivial permutation $\mathrm{Id}_X : X \to X$, $x \mapsto x$,
which takes each element to itself, is called the identity map.


Exercise 1.5 (Dirichlet’s Principle) Convince yourself that the following condi- 
tions on a set X are equivalent: (a) X is infinite; (b) there exists a nonsurjective 
injection X ↪ X; (c) there exists a noninjective surjection X ↠ X.


Exercise 1.6 Show that Aut(ℕ) is an uncountable set.²


Example 1.1 (Recording Maps by Words) Given two finite sets X = {1, 2, ..., n},
Y = {1, 2, ..., m}, every map f : X → Y can be represented by a sequence of
its values $w(f) \stackrel{\text{def}}{=} (f(1), f(2), \ldots, f(n))$ viewed as an n-letter word in the m-letter
alphabet Y. For example, the maps f : {1, 2} → {1, 2, 3} and g : {1, 2, 3} →
{1, 2, 3} defined by the assignments f(1) = 3, f(2) = 2 and g(1) = 1, g(2) = 2,
g(3) = 2 are represented by the words w(f) = (3, 2) and w(g) = (1, 2, 2) in the
alphabet {1, 2, 3}. Therefore, we get the bijection

$$ w : \operatorname{Hom}(X, Y) \xrightarrow{\sim} \{\,|X|\text{-letter words in the alphabet } Y\,\}, \quad f \mapsto w(f). $$

This map takes monomorphisms to words without duplicate letters. Epimorphisms
go to words containing the whole alphabet. Isomorphisms go to words in which
every letter of the alphabet appears exactly once.
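The word encoding makes it easy to enumerate and classify maps between small finite sets by machine. The following sketch stores a word as a Python tuple; the function names `words`, `is_mono`, and `is_epi` are assumptions made for illustration only.

```python
from itertools import product

def words(n, m):
    """All n-letter words in the alphabet {1, ..., m}, i.e., all maps {1..n} -> {1..m}."""
    return list(product(range(1, m + 1), repeat=n))

def is_mono(w):
    """A word encodes an injective map iff it has no duplicate letters."""
    return len(set(w)) == len(w)

def is_epi(w, m):
    """A word encodes a surjective map iff it contains the whole alphabet."""
    return set(w) == set(range(1, m + 1))

w_f = (3, 2)       # the word of the map f from the example
w_g = (1, 2, 2)    # the word of the map g from the example

print(is_mono(w_f), is_epi(w_f, 3))
print(is_mono(w_g), is_epi(w_g, 3))
print(w_f in words(2, 3), w_g in words(3, 3))
```

Here f is injective but not surjective, while g is neither, since the letter 2 is repeated and the letter 3 is missing.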


1.1.3 Fibers of Maps 


A map f : X → Y decomposes X into the disjoint union of nonempty subsets $f^{-1}(y)$
indexed by the elements y ∈ im(f):

$$ X = \bigsqcup_{y \in \operatorname{im}(f)} f^{-1}(y). \tag{1.1} $$

This viewpoint may be useful when we need to compare cardinalities of sets. For
example, if all fibers of the map f : X → Y have the same cardinality $m = |f^{-1}(y)|$,
then

$$ |X| = m \cdot |\operatorname{im} f|. \tag{1.2} $$


² A set is called countable if it is isomorphic to ℕ. An infinite set not isomorphic to ℕ is called
uncountable.


Proposition 1.1 $|\operatorname{Hom}(X, Y)| = |Y|^{|X|}$ for all finite sets X, Y.


Proof Fix an arbitrary point x ∈ X and consider the evaluation map

$$ \operatorname{ev}_x : \operatorname{Hom}(X, Y) \to Y, \quad f \mapsto f(x), \tag{1.3} $$

which takes the map f : X → Y to its value at x. The maps X → Y with a
prescribed value at x are in bijection with the maps X ∖ {x} → Y. Thus, $|\operatorname{ev}_x^{-1}(y)| =
|\operatorname{Hom}(X \setminus \{x\}, Y)|$ for all y ∈ Y. Hence, $|\operatorname{Hom}(X, Y)| = |\operatorname{Hom}(X \setminus \{x\}, Y)| \cdot |Y|$ by
formula (1.2). In other words, when we add one more point to X, the cardinality of
Hom(X, Y) is multiplied by |Y|. □


Remark 1.1 In the light of Proposition 1.1, the set of all maps X → Y is often
denoted by

$$ Y^X \stackrel{\text{def}}{=} \operatorname{Hom}(X, Y). $$


Remark 1.2 In the above proof, we assumed that both sets are nonempty. If X = ∅,
then for each Y, there exists just one map ∅ ↪ Y, namely the empty map, which
takes every element of X (of which there are none) to an arbitrary element of Y. In
this case, the evaluation map (1.3) is not defined. However, Proposition 1.1 is still
true: $1 = |Y|^0$. Note that $\operatorname{Hom}(\varnothing, \varnothing) = \{\mathrm{Id}_\varnothing\}$ has cardinality 1, i.e., $0^0 = 1$ in our
current context. If Y = ∅, then Hom(X, ∅) = ∅ for every X ≠ ∅. This agrees with
Proposition 1.1 as well: $0^{|X|} = 0$ for |X| > 0.


Proposition 1.2 Let |X| = |Y| = n. We write Isom(X, Y) ⊂ Hom(X, Y) for the set
of all bijections $X \xrightarrow{\sim} Y$. Then |Isom(X, Y)| = n!, where $n! = n \cdot (n-1) \cdot (n-2) \cdots 1$.
In particular, |Aut(X)| = n!.


Proof For every x ∈ X, the restriction of the evaluation map (1.3) to the subset
of bijections assigns the surjective map $\operatorname{ev}_x : \operatorname{Isom}(X, Y) \twoheadrightarrow Y$, $f \mapsto f(x)$.
The bijections $f : X \xrightarrow{\sim} Y$ with a prescribed value y = f(x) are in one-to-one
correspondence with all bijections $X \setminus \{x\} \xrightarrow{\sim} Y \setminus \{y\}$. Since the cardinality
of Isom(X ∖ {x}, Y ∖ {y}) does not depend on x, y, we have $|\operatorname{Isom}(X, Y)| =
|\operatorname{Isom}(X \setminus \{x\}, Y \setminus \{y\})| \cdot |Y|$ by formula (1.2). In other words, when we add one
more point to both X and Y, the cardinality of Isom(X, Y) is multiplied by |Y| + 1,
the cardinality of the enlarged set Y. □


Remark 1.3 The product $n! = n \cdot (n-1) \cdot (n-2) \cdots 1$ is called n-factorial. Since
$\operatorname{Aut}(\varnothing) = \{\mathrm{Id}_\varnothing\}$ has cardinality 1, we define $0! \stackrel{\text{def}}{=} 1$.
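Propositions 1.1 and 1.2, together with the conventions for empty sets, can be checked by brute-force enumeration for small cardinalities. This is only a sanity check under the assumption that maps are encoded as tuples of values; the function names are hypothetical.

```python
from itertools import permutations, product
from math import factorial

def count_maps(n, m):
    """Number of maps from an n-element set to an m-element set, by enumeration."""
    return sum(1 for _ in product(range(m), repeat=n))

def count_bijections(n):
    """Number of bijections between two n-element sets, by enumeration."""
    return sum(1 for _ in permutations(range(n)))

# |Hom(X, Y)| = |Y|^|X|, including the empty cases of Remark 1.2.
for n, m in [(0, 0), (0, 3), (2, 0), (3, 2), (2, 3)]:
    assert count_maps(n, m) == m ** n

# |Isom(X, Y)| = n!, including 0! = 1.
for n in range(6):
    assert count_bijections(n) == factorial(n)
```

Python's own conventions happen to agree with the remarks above: `0 ** 0` evaluates to 1, and both enumerations yield exactly one (empty) tuple when n = 0.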
Example 1.2 (Multinomial Coefficients) To multiply out the expression $(a_1 + a_2 +
\cdots + a_m)^n$, we may place the factors in a line:

$$ (a_1 + a_2 + \cdots + a_m) \cdot (a_1 + a_2 + \cdots + a_m) \cdots (a_1 + a_2 + \cdots + a_m). $$

Then for each i = 1, 2, ..., n, we choose some letter $a_{\nu_i}$ within the ith pair of
parentheses and form the word $a_{\nu_1} a_{\nu_2} \cdots a_{\nu_n}$ from them. After doing this in all
possible ways, adding all the words together, and collecting like monomials, we get
the sum

$$ (a_1 + a_2 + \cdots + a_m)^n = \sum_{\substack{k_1 + k_2 + \cdots + k_m = n \\ \forall i,\ 0 \le k_i \le n}} \binom{n}{k_1, \ldots, k_m} \cdot a_1^{k_1} a_2^{k_2} \cdots a_m^{k_m}, \tag{1.4} $$

where each exponent $k_i$ varies over the range $0 \le k_i \le n$, and the total degree of
each monomial equals $k_1 + k_2 + \cdots + k_m = n$. The coefficient $\binom{n}{k_1, \ldots, k_m}$ of
the monomial $a_1^{k_1} a_2^{k_2} \cdots a_m^{k_m}$ is called a multinomial coefficient. It equals the number
of all n-letter words that can be written with exactly $k_1$ letters $a_1$, $k_2$ letters $a_2$,
etc. To evaluate it precisely, write Y for the set of all such words. Then for each
i = 1, 2, ..., m, mark the $k_i$ identical letters $a_i$ each with different upper index
1, 2, ..., $k_i$ in order to distinguish these letters from one another. Now write X for
the set of all n-letter words written with n distinct marked letters

$$ \underbrace{a_1^{(1)}, a_1^{(2)}, \ldots, a_1^{(k_1)}}_{k_1 \text{ marked letters } a_1},\ \underbrace{a_2^{(1)}, a_2^{(2)}, \ldots, a_2^{(k_2)}}_{k_2 \text{ marked letters } a_2},\ \ldots,\ \underbrace{a_m^{(1)}, a_m^{(2)}, \ldots, a_m^{(k_m)}}_{k_m \text{ marked letters } a_m} $$

and containing each letter exactly once. We know from Proposition 1.2 that |X| =
n!. Consider the forgetful surjection f : X ↠ Y, which erases all the upper indices.
The preimage of every word y ∈ Y under this map consists of the $k_1! \cdot k_2! \cdots k_m!$
words obtained from y by marking the $k_1$ letters $a_1$, $k_2$ letters $a_2$, etc. with upper
indices in all possible ways. Formula (1.2) on p. 4 leads to

$$ \binom{n}{k_1, \ldots, k_m} = \frac{n!}{k_1! \cdot k_2! \cdots k_m!}. \tag{1.5} $$

Thus, the expansion (1.4) becomes

$$ (a_1 + a_2 + \cdots + a_m)^n = \sum_{\substack{k_1 + k_2 + \cdots + k_m = n \\ \forall i,\ 0 \le k_i \le n}} \frac{n! \cdot a_1^{k_1} a_2^{k_2} \cdots a_m^{k_m}}{k_1! \cdot k_2! \cdots k_m!}. \tag{1.6} $$
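The counting argument behind formula (1.5) can be compared with a direct enumeration of words for small exponents. The sketch below is an illustration only: letters are encoded as the integers 0, ..., m-1, and the function names `multinomial` and `count_words` are assumptions introduced for the example.

```python
from collections import Counter
from itertools import product
from math import factorial

def multinomial(ks):
    """n! / (k_1! * ... * k_m!) with n = k_1 + ... + k_m, as in formula (1.5)."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)    # each intermediate quotient is an integer
    return result

def count_words(ks):
    """Count n-letter words that use letter i exactly ks[i] times, by enumeration."""
    n, m = sum(ks), len(ks)
    target = Counter({i: k for i, k in enumerate(ks) if k > 0})
    return sum(1 for w in product(range(m), repeat=n) if Counter(w) == target)

for ks in [(2, 1, 1), (3, 0, 2), (1, 1, 1), (4,)]:
    assert multinomial(ks) == count_words(ks)

print(multinomial((2, 1, 1)))
```

The enumeration visits all m^n words, so it is practical only for very small n and m, whereas the closed formula is immediate.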
Exercise 1.7 How many summands are there on the right-hand side of (1.6)?

For m = 2, we get the following well-known formula³:

$$ (a + b)^n = \sum_{k=0}^{n} \frac{n! \cdot a^k b^{n-k}}{k!\,(n-k)!}. \tag{1.7} $$

The binomial coefficient $\frac{n!}{k!\,(n-k)!}$ is usually denoted by either $\binom{n}{k}$ or $C_n^k$ instead of
$\binom{n}{k,\,n-k}$. We will use the notation $\binom{n}{k}$. Note that it can be written as

$$ \binom{n}{k} = \frac{n \cdot (n-1) \cdots (n-k+1)}{k \cdot (k-1) \cdots 1}, $$

where both the numerator and denominator consist of k decreasing integer factors.

Example 1.3 (Young Diagrams) The decomposition of the finite set X =
{1, 2, ..., n} into a disjoint union of nonempty subsets

$$ X = X_1 \sqcup X_2 \sqcup X_3 \sqcup \cdots \sqcup X_k \tag{1.8} $$

can be encoded as follows. Renumber the subsets $X_i$ in any nonincreasing order of
their cardinalities and set $\lambda_i = |X_i|$. We obtain a nonincreasing sequence of integers

$$ \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_k), \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k, \tag{1.9} $$

called a partition of n = |X| or a shape of the decomposition (1.8). Partitions are
visualized by diagrams like this:

    □ □ □ □ □ □
    □ □ □ □ □
    □ □ □ □ □
    □ □ □                (1.10)
    □

Such a diagram is formed by cellular strips of lengths $\lambda_1, \lambda_2, \ldots, \lambda_k$ aligned at the
left and of nonincreasing length from top to bottom. It is called a Young diagram
of the partition λ. We will make no distinction between a partition and its diagram
and denote both by the same letter. The total number of cells in the diagram λ is
called the weight and denoted by |λ|. The number of rows is called the length of the
diagram and denoted by ℓ(λ). Thus, the Young diagram (1.10) depicts the partition
λ = (6, 5, 5, 3, 1) of weight |λ| = 20 and length ℓ(λ) = 5.


³ This is a particular case of the generic Newton's binomial theorem, which expands $(1 + x)^\alpha$ with
an arbitrary α. We will prove it in Sect. 1.2.


Exercise 1.8 How many Young diagrams can be drawn within a k × n rectangle?⁴


If we fill the cells of λ by the elements of X (one element per cell) and combine
the elements placed in row i into one subset $X_i \subset X$, then we obtain the
decomposition (1.8) of shape λ. Since every decomposition of shape λ can be
achieved in this way from an appropriate filling, we get a surjective map from
the set of all fillings of λ to the set of all decompositions (1.8) of shape λ. All
the fibers of this map have the same cardinality. Namely, two fillings produce the
same decomposition if and only if they are obtained from each other either by
permuting elements within rows or by permuting entire rows of equal length. Let
us write $m_i$ for the number of rows of length⁵ i in λ. By Proposition 1.2, there are
$\prod_{i=1}^{k} \lambda_i! = \prod_{i=1}^{n} (i!)^{m_i}$ permutations of the first type and $\prod_{i=1}^{n} m_i!$ permutations of the
second type. Since they act independently, each fiber has cardinality $\prod_{i=1}^{n} (i!)^{m_i}\, m_i!$.
Therefore, n! fillings produce

$$ \frac{n!}{\prod_{i=1}^{n} m_i! \cdot (i!)^{m_i}} \tag{1.11} $$

different decompositions of a set of cardinality n into a disjoint union of $m_1$
one-element subsets, $m_2$ subsets of cardinality 2, $m_3$ subsets of cardinality 3, etc.
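Formula (1.11) can be tested against a direct enumeration of all decompositions of a small set. The following sketch is one possible check, not the method of the text: the recursive generator `set_partitions` and the other names are assumptions introduced for this illustration.

```python
from collections import Counter
from math import factorial

def set_partitions(elements):
    """Yield every decomposition of a list into disjoint nonempty blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition

def count_by_formula(shape):
    """Formula (1.11): n! divided by the product of m_i! * (i!)^(m_i)."""
    denominator = 1
    for i, m_i in Counter(shape).items():    # m_i = number of rows of length i
        denominator *= factorial(m_i) * factorial(i) ** m_i
    return factorial(sum(shape)) // denominator

def count_by_enumeration(shape):
    """Count decompositions of {1, ..., n} whose block sizes form the given shape."""
    target = sorted(shape, reverse=True)
    elements = list(range(1, sum(shape) + 1))
    return sum(
        1
        for p in set_partitions(elements)
        if sorted((len(block) for block in p), reverse=True) == target
    )

for shape in [(2, 2, 1), (3, 1), (1, 1, 1), (2, 1, 1)]:
    assert count_by_formula(shape) == count_by_enumeration(shape)

print(count_by_formula((2, 2, 1)))
```

For the shape (2, 2, 1) both counts give 15: there are 5 choices for the one-element subset and 3 ways to split the remaining four elements into two pairs.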


1.2 Equivalence Classes 


1.2.1 Equivalence Relations 


Another way of decomposing X into a disjoint union of subsets is to declare the 
elements in each subset to be equivalent. This can be formalized as follows. A subset 
$R \subset X \times X = \{(x_1, x_2) \mid x_1, x_2 \in X\}$ is called a binary relation on X. If $(x_1, x_2) \in R$,
we write $x_1 \sim_R x_2$ and say that R relates $x_1$ with $x_2$. We omit the letter R from this
notation when R is clear from context or is inessential.

For example, the following binary relations on the set of integers ℤ are commonly
used:

$$ \text{equality}: \quad x_1 \sim x_2, \text{ meaning that } x_1 = x_2; \tag{1.12} $$

$$ \text{inequality}: \quad x_1 \sim x_2, \text{ meaning that } x_1 \le x_2; \tag{1.13} $$

$$ \text{divisibility}: \quad x_1 \sim x_2, \text{ meaning that } x_1 \mid x_2; \tag{1.14} $$

$$ \text{congruence modulo } n: \quad x_1 \sim x_2, \text{ meaning that } x_1 \equiv x_2 \pmod n. \tag{1.15} $$

(The last of these is read "$x_1$ is congruent to $x_2$ modulo n" and signifies that n divides
$x_1 - x_2$.)


⁴ The upper left-hand corner of each diagram should coincide with that of the rectangle. The empty
diagram and the whole rectangle are allowed.

⁵ Note that the equality $|\lambda| = n = m_1 + 2m_2 + \cdots + n m_n$ forces many of the $m_i$ to vanish.


Definition 1.1 A binary relation ~ is called an equivalence relation or simply an 
equivalence if it satisfies the following three properties: 


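$$ \text{reflexivity}: \quad \forall\, x \in X, \quad x \sim x; $$

$$ \text{transitivity}: \quad \forall\, x_1, x_2, x_3 \in X, \quad x_1 \sim x_2 \ \&\ x_2 \sim x_3 \ \Rightarrow\ x_1 \sim x_3; $$

$$ \text{symmetry}: \quad \forall\, x_1, x_2 \in X, \quad x_1 \sim x_2 \iff x_2 \sim x_1. $$


In the above list of binary relations on ℤ, (1.12) and (1.15) are equivalences.
Relations (1.13) and (1.14) are not symmetric.⁶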

If X is decomposed into a disjoint union of subsets, then the relation $x_1 \sim x_2$,
meaning that $x_1, x_2$ belong to the same subset, is an equivalence relation. Conversely,
given an equivalence relation R on X, let us introduce the notion of an equivalence
class of x as

$$ [x]_R = \{z \in X \mid x \sim_R z\} = \{z \in X \mid z \sim_R x\}, $$

where the second equality holds because R is symmetric.

Exercise 1.9 Verify that any two classes $[x]_R$, $[y]_R$ either coincide or are disjoint.


Thus, X decomposes into a disjoint union of distinct equivalence classes. The set of
these equivalence classes is denoted by X/R and called the quotient or factor set of
X by R. The surjective map sending an element to its equivalence class,

$$ f : X \twoheadrightarrow X/R, \quad x \mapsto [x]_R, \tag{1.16} $$

is called the quotient map or factorization map. Its fibers are exactly the equivalence
classes. Every surjective map f : X ↠ Y is the quotient map modulo the
equivalence defined by $x_1 \sim x_2$ if $f(x_1) = f(x_2)$.
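The last remark says that the fibers of any map are the classes of an equivalence relation. For a finite set this decomposition can be computed by grouping points according to their images. The sketch below is an illustration; the helper name `fibers` and the sample map x ↦ x² are assumptions, not taken from the text.

```python
from collections import defaultdict

def fibers(f, X):
    """Group the points of X by their image under f: a dict y -> preimage of y."""
    classes = defaultdict(set)
    for x in X:
        classes[f(x)].add(x)
    return dict(classes)

# Sample relation: x1 ~ x2 if and only if x1^2 == x2^2.
X = range(-4, 5)
classes = fibers(lambda x: x * x, X)

for value, cls in sorted(classes.items()):
    print(value, sorted(cls))
```

The classes {0}, {-1, 1}, {-2, 2}, {-3, 3}, {-4, 4} are pairwise disjoint and cover X, as the decomposition (1.1) predicts.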


Example 1.4 (Residue Classes) Fix a nonzero n ∈ ℤ and write ℤ/(n) for the
quotient of ℤ modulo the congruence relation (1.15). The elements of ℤ/(n) are
called residue classes modulo n. The class of a number z ∈ ℤ is denoted by
$[z]_n$ or simply by [z] when the value of n is clear from context or is inessential.


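⁶ They are skew-symmetric, i.e., they satisfy the condition $x_1 \sim x_2$ & $x_2 \sim x_1$ ⇒ $x_1 = x_2$
(for divisibility, only after restriction to positive integers, since 1 | -1 and -1 | 1); see
Sect. 1.4 on p. 13.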
The factorization map

$$ \mathbb{Z} \twoheadrightarrow \mathbb{Z}/(n), \quad z \mapsto [z]_n, $$

is called reduction modulo n. The set ℤ/(n) consists of the n elements
$[0]_n, [1]_n, \ldots, [n-1]_n$, in bijection with the residues of division by n. However,
it may sometimes be more productive to think of residue classes as subsets in ℤ,
because this allows us to vary the representation of an element depending on what
we need. For example, the residue of division of $12^{100}$ by 13 can be evaluated
promptly as follows:

$$ \left[12^{100}\right]_{13} = [12]_{13}^{100} = [-1]_{13}^{100} = \left[(-1)^{100}\right]_{13} = [1]_{13}. $$


Exercise 1.10 Prove the consistency of the above computation, i.e., verify that the
residue classes $[x + y]_n$ and $[xy]_n$ do not depend on the choice of elements $x \in [x]_n$
and $y \in [y]_n$ used in their representations.


Thus, the quotient set ℤ/(n) has a well-defined addition and multiplication given by

$$ [x]_n + [y]_n \stackrel{\text{def}}{=} [x + y]_n, \quad [x]_n \cdot [y]_n \stackrel{\text{def}}{=} [xy]_n. \tag{1.17} $$
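The computation with $12^{100}$ and the independence of representatives can be spot-checked numerically. This is a numerical illustration only and not a proof; the helper `residue` is a hypothetical name, and Python's `%` operator and three-argument `pow` are used for the reductions.

```python
def residue(z, n):
    """Standard representative of the class [z]_n in {0, 1, ..., n-1}, for n > 0."""
    return z % n

n = 13
assert residue(12, n) == residue(-1, n)    # [12] = [-1] modulo 13
assert pow(12, 100, n) == 1                # built-in modular exponentiation

# Changing representatives by multiples of n does not change the classes
# of the sum and of the product (checked on a few shifts only).
for x in range(n):
    for y in range(n):
        for s in (-2, 0, 3):
            for t in (-1, 0, 5):
                x2, y2 = x + s * n, y + t * n
                assert residue(x2 + y2, n) == residue(x + y, n)
                assert residue(x2 * y2, n) == residue(x * y, n)

print(pow(12, 100, n))
```

For positive n, Python's `%` always returns a value between 0 and n-1, even for negative arguments, which is why `residue(-1, 13)` equals 12.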


1.2.2 Implicitly Defined Equivalences


Given a family of equivalence relations $R_\nu \subset X \times X$, the intersection $\bigcap_\nu R_\nu \subset X \times X$
is again an equivalence relation. Indeed, if each set $R_\nu \subset X \times X$ contains the diagonal
$\Delta = \{(x, x) \mid x \in X\} \subset X \times X$ (reflexivity), goes to itself under reflection $(x_1, x_2) \leftrightarrows
(x_2, x_1)$ (symmetry), and contains for every pair of points $(x, y), (y, z) \in R_\nu$ the
point (x, z) as well (transitivity), then the intersection $\bigcap_\nu R_\nu$ will inherit the same
properties. Therefore, for every subset S ⊂ X × X, there exists a unique equivalence
relation $\overline{S} \supset S$ contained in all equivalence relations containing S. It is called the
equivalence relation generated by S and can be described as the intersection of all
equivalence relations containing S. A more constructive description is given in the
next exercise.


Exercise 1.11 Check that x is related to y by $\overline{S}$ if and only if there exists a
finite sequence of points $x = z_0, z_1, z_2, \ldots, z_n = y$ in X such that for each
i = 1, 2, ..., n, either $(z_{i-1}, z_i)$ or $(z_i, z_{i-1})$ belongs to S.


However, such an implicit description may be quite ineffective even for understand- 
ing whether there are any inequivalent points at all. 
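For a finite set X and a finite set of generating pairs, the classes of the generated equivalence can be computed explicitly. The sketch below uses the union-find (disjoint-set) technique, which is a standard algorithm swapped in here for illustration and is not described in the text; the function name `generated_classes` and the sample data are assumptions.

```python
def generated_classes(X, S):
    """Classes of the smallest equivalence relation on X containing all pairs in S."""
    parent = {x: x for x in X}

    def find(x):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in S:
        parent[find(a)] = find(b)    # merge the classes of a and b

    classes = {}
    for x in X:
        classes.setdefault(find(x), set()).add(x)
    return sorted(classes.values(), key=min)

X = range(1, 8)
S = [(1, 2), (3, 2), (4, 5), (7, 7)]
print(generated_classes(X, S))
```

In this sample, 1 and 3 end up in the same class although the pair (1, 3) is not in S: they are joined through the chain 1, 2, 3, in which each step uses a pair of S in one direction or the other.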


Example 1.5 (Fractions) The set of rational numbers ℚ is usually introduced as the
set of fractions a/b, where a, b ∈ ℤ, b ≠ 0. By definition, such a fraction is an
equivalence class of the pair (a, b) ∈ ℤ × (ℤ ∖ 0) modulo the equivalence generated
by the relations

$$ (a, b) \sim (ac, bc) \quad \text{for all } c \in \mathbb{Z} \setminus 0, \tag{1.18} $$

which assert the equality of the fractions a/b = (ac)/(bc). The relations (1.18)
do not themselves form an equivalence relation. Indeed, if $a_1 b_2 = a_2 b_1$, then the
leftmost element in the two-step chain

$$ (a_1, b_1) \sim (a_1 b_2, b_1 b_2) = (a_2 b_1, b_1 b_2) \sim (a_2, b_2) $$

may not be related to the rightmost one directly by (1.18). For example, 3/6 and
5/10 produce equal fractions and are not directly related. Thus, the equivalence
generated by (1.18) must contain the relations

$$ (a_1, b_1) \sim (a_2, b_2) \quad \text{for all } a_1 b_2 = a_2 b_1. \tag{1.19} $$


Exercise 1.12 Verify that the relations (1.19) are reflexive, symmetric, and transi- 
tive. 


Hence, relations (1.19) give a complete explicit description for the equivalence 
generated by relations (1.18). 
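The difference between the generating relations (1.18) and the full equivalence (1.19) can be seen on the pairs (3, 6) and (5, 10). In the sketch below, pairs are Python tuples with nonzero second entry; the function names `directly_related` and `equivalent` are hypothetical, and the standard `fractions.Fraction` type is used only as an independent cross-check.

```python
from fractions import Fraction

def directly_related(p, q):
    """True if q == (a*c, b*c) for some nonzero integer c, where p == (a, b); see (1.18)."""
    (a, b), (a2, b2) = p, q
    if b2 % b != 0:
        return False
    c = b2 // b
    return c != 0 and a2 == a * c

def equivalent(p, q):
    """Cross-multiplication test (1.19): a1 * b2 == a2 * b1."""
    (a1, b1), (a2, b2) = p, q
    return a1 * b2 == a2 * b1

p, q = (3, 6), (5, 10)
print(directly_related(p, q), directly_related(q, p))    # neither direction holds
print(equivalent(p, q))                                  # but the pairs are equivalent
print(directly_related((1, 2), p), directly_related((1, 2), q))
print(Fraction(*p) == Fraction(*q))
```

Both pairs are obtained from (1, 2) by relation (1.18), which is why they fall into the same class even though neither is an integer multiple of the other.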


1.3 Compositions of Maps


1.3.1 Composition Versus Multiplication 


A composition of maps f : X → Y and g : Y → Z is a map

$$ g \circ f : X \to Z, \quad x \mapsto g(f(x)). $$

The notation g ∘ f is usually shortened to gf, which should not be confused with a
product of numbers. In fact, the algebraic properties of compositions differ from
those used in numeric computations. The composition of maps is not commutative:
fg ≠ gf in general. When fg is defined, gf may not be. Even if both compositions
are well defined, say for endomorphisms f, g ∈ End(X) of some set X, the equality
fg = gf usually fails.


Exercise 1.13 Let two lines $\ell_1$, $\ell_2$ in the plane cross at the point O. Write $\sigma_1$ and
$\sigma_2$ for the reflections (i.e., axial symmetries) of the plane in these lines. Describe
explicitly the motions $\sigma_1 \sigma_2$ and $\sigma_2 \sigma_1$. How should the lines be situated in order to
get $\sigma_1 \sigma_2 = \sigma_2 \sigma_1$?


Cancellation of common factors also fails. Generically, neither fg = fh nor gf = hf 
implies g = h. 
Example 1.6 (Endomorphisms of a Two-Element Set) The set X = {1, 2} has four
endomorphisms. Let us record maps f : X → X by two-letter words (f(1), f(2))
as in Example 1.1 on p. 3. Then the four endomorphisms of X are (1, 1), (1, 2) =
$\mathrm{Id}_X$, (2, 1), (2, 2). The compositions fg are collected in the following multiplication
table:

    f \ g  | (1,1)  (1,2)  (2,1)  (2,2)
    -------+---------------------------
    (1,1)  | (1,1)  (1,1)  (1,1)  (1,1)
    (1,2)  | (1,1)  (1,2)  (2,1)  (2,2)
    (2,1)  | (2,2)  (2,1)  (1,2)  (1,1)
    (2,2)  | (2,2)  (2,2)  (2,2)  (2,2)

                                        (1.20)

Note that (2, 2) ∘ (1, 1) ≠ (1, 1) ∘ (2, 2). Moreover, (1, 1) ∘ (1, 2) = (1, 1) ∘ (2, 1), whereas
(1, 2) ≠ (2, 1), and (1, 1) ∘ (2, 2) = (2, 1) ∘ (2, 2), whereas (1, 1) ≠ (2, 1).
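The multiplication table can be regenerated mechanically from the word encoding, which also allows a check of the failures of commutativity and cancellation noted above. The sketch assumes that a word is a tuple of values indexed from 1; the function name `compose` is an assumption made for the example.

```python
from itertools import product

def compose(f, g):
    """Word of the composition fg, i.e., of the map x -> f(g(x))."""
    return tuple(f[g_x - 1] for g_x in g)    # words list the values at 1, 2, ...

endos = list(product((1, 2), repeat=2))      # (1,1), (1,2), (2,1), (2,2)
table = {(f, g): compose(f, g) for f in endos for g in endos}

# One row per left factor f, one column per right factor g.
for f in endos:
    print(f, [table[f, g] for g in endos])

# The composition is not commutative, and cancellation fails on both sides.
assert table[(2, 2), (1, 1)] != table[(1, 1), (2, 2)]
assert table[(1, 1), (1, 2)] == table[(1, 1), (2, 1)]
assert table[(1, 1), (2, 2)] == table[(2, 1), (2, 2)]
```

The same function also confirms associativity on this small example by comparing `compose(compose(h, g), f)` with `compose(h, compose(g, f))` over all triples.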


The only nice property of numeric multiplication shared by the composition of 
maps is associativity: (hg)f = h(gf) for every triple of maps f : X → Y, g : Y → Z,
h : Z → T. Indeed, in each case, we have x ↦ h(g(f(x))).


Lemma 1.1 (Left Inverse Map) The following conditions on a map f : X → Y
are equivalent:


1. f is injective;

2. there exists a map g : Y → X such that $gf = \mathrm{Id}_X$ (any such g is called a left
inverse to f);

3. for any two maps $g_1, g_2 : Z \to X$ such that $f g_1 = f g_2$, the equality $g_1 = g_2$ holds.


Proof We verify the implications (1) ⇒ (2) ⇒ (3) ⇒ (1). Let f be injective. For
y = f(x), put g(y) = x. For y ∉ im f, define g(y) arbitrarily. Then g : Y → X
satisfies (2). If (2) holds, then the left composition of both sides of the equality
$f g_1 = f g_2$ with g leads to $g_1 = g_2$. Finally, if $f(x_1) = f(x_2)$ for some $x_1 \neq x_2$, then
(3) is not satisfied for $g_1 = \mathrm{Id}_X$ and $g_2 : X \xrightarrow{\sim} X$ that swaps $x_1$, $x_2$ and leaves all the
other points fixed. □
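The first step of the proof is constructive and can be carried out literally for finite sets. The following sketch assumes that maps are stored as dictionaries and that X is nonempty; the function name `left_inverse` and the choice of the first element of X as the arbitrary value outside the image are assumptions made for illustration.

```python
def left_inverse(f, X, Y):
    """Return g : Y -> X with g(f(x)) == x for all x, assuming f is injective and X is nonempty."""
    X = list(X)
    default = X[0]                  # arbitrary value for the points outside im f
    g = {y: default for y in Y}
    for x in X:
        g[f[x]] = x                 # well defined because f is injective
    return g

X, Y = [1, 2, 3], ["a", "b", "c", "d"]
f = {1: "c", 2: "a", 3: "d"}        # an injective map X -> Y
g = left_inverse(f, X, Y)

assert all(g[f[x]] == x for x in X)             # gf is the identity on X
print(g)
print([f[g[y]] for y in Y])                     # fg is not the identity on Y
```

The composition in the other order sends "b" to f(1) = "c", so g is a left inverse but not a right inverse, in agreement with the fact that f is not surjective.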


1.3.2 Right Inverse Map and the Axiom of Choice


A feeling of harmony calls for the right counterpart of Lemma 1.1. We expect that
the following conditions on a map f : X → Y should be equivalent:


(1) f is surjective;

(2) there exists a map g : Y → X such that $fg = \mathrm{Id}_Y$;

(3) for any two maps $g_1, g_2 : Y \to Z$ such that $g_1 f = g_2 f$, the equality $g_1 = g_2$
holds.


If these conditions hold, we shall call the map g from (2) a right inverse to f. Another
conventional name for g is a section of the surjective map f, because every map g
satisfying (2) just selects some element $g(y) \in f^{-1}(y)$ in the fiber of f over each point
y ∈ Y simultaneously for all y ∈ Y. In rigorous set theory, which we try to avoid
here, there is a special selection axiom, called the axiom of choice, postulating that
every surjective map of sets admits a section. Thus, implication (1) ⇒ (2) is part of
the rigorous definition of a set. The proof of the implication (2) ⇒ (3) is completely
symmetric to the proof from Lemma 1.1: compose both sides of $g_1 f = g_2 f$ with g
from the right and obtain $g_1 = g_2$. Implication (3) ⇒ (1) is proved by contradiction:
if y ∉ im f, then (3) fails for $g_1 = \mathrm{Id}_Y$ and every $g_2 : Y \to Y$ that takes y to some
point in im f and leaves all other points fixed. Therefore, the above three properties,
symmetric to those of Lemma 1.1, are equivalent as well.


1.3.3 Invertible Maps 


If a map $f : X \xrightarrow{\sim} Y$ is bijective, then the preimage $f^{-1}(y) \subset X$ of a point y ∈ Y
consists of exactly one point. Therefore, the prescription $y \mapsto f^{-1}(y)$ defines a map
$f^{-1} : Y \to X$ that is simultaneously a left and right inverse to f, i.e., it satisfies the
equalities

$$ f \circ f^{-1} = \mathrm{Id}_Y \quad \text{and} \quad f^{-1} \circ f = \mathrm{Id}_X. \tag{1.21} $$

The map $f^{-1}$ is called a (two-sided) inverse to f.

Proposition 1.3 The following properties of a map f : X → Y are equivalent:


(1) f is bijective;
(2) there exists a map g : Y → X such that $f \circ g = \mathrm{Id}_Y$ and $g \circ f = \mathrm{Id}_X$;
(3) there exist maps $g', g'' : Y \to X$ such that $g' \circ f = \mathrm{Id}_X$ and $f \circ g'' = \mathrm{Id}_Y$.


If f satisfies these properties, then $g = g' = g'' = f^{-1}$, where $f^{-1}$ is the map defined
before formula (1.21).


Proof If (1) holds, then $g = f^{-1}$ satisfies (2). Implication (2) ⇒ (3) is obvious.
Conversely, if (3) holds, then $g' = g' \circ \mathrm{Id}_Y = g' \circ (f \circ g'') = (g' \circ f) \circ g'' =
\mathrm{Id}_X \circ g'' = g''$. Therefore, (2) holds for $g = g' = g''$. Finally, let (2) hold. Then
for every y ∈ Y, the preimage $f^{-1}(y)$ contains g(y), because f(g(y)) = y. Moreover,
every $x \in f^{-1}(y)$ equals g(y): $x = \mathrm{Id}_X(x) = g(f(x)) = g(y)$. Hence, f is bijective,
and $g = f^{-1}$. □


1.3.4 Transformation Groups


Let X be an arbitrary set. A nonempty subset G ⊂ Aut(X) is called a transformation
group of X if $\forall\, g_1, g_2 \in G$, $g_1 g_2 \in G$ and $\forall\, g \in G$, $g^{-1} \in G$. Note that every
transformation group automatically contains the identity map $\mathrm{Id}_X$, because $\mathrm{Id}_X =
g^{-1} g$ for every g ∈ G. For a finite transformation group G, its cardinality |G| is
called the order of G. Every transformation group H ⊂ G is called a subgroup of G.
Every transformation group is a subgroup of the group Aut(X) of all automorphisms
of X.


Example 1.7 (Permutation Groups) For X = {1, 2, ..., n}, the group Aut(X) is
denoted by $S_n$ and called the nth symmetric group or the permutation group of n
elements. By Proposition 1.2, $|S_n| = n!$. We will indicate a permutation $\sigma \in S_n$ by
the row $(\sigma_1, \sigma_2, \ldots, \sigma_n)$ of its values $\sigma_i = \sigma(i)$, as in Example 1.1. For example,

    σ = (3, 4, 2, 1)   and   τ = (2, 3, 4, 1)

encode the maps

    1 2 3 4         1 2 3 4
    ↓ ↓ ↓ ↓   and   ↓ ↓ ↓ ↓
    3 4 2 1         2 3 4 1

The compositions of these maps are recorded as στ = (4, 2, 1, 3) and τσ =
(4, 1, 3, 2).
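The two compositions can be recomputed from the rows of values. In the sketch below a permutation is a tuple of values indexed from 1; the function names `compose` and `inverse` are assumptions made for the example.

```python
def compose(sigma, tau):
    """Row of values of the composition sigma tau, i.e., of the map i -> sigma(tau(i))."""
    return tuple(sigma[t - 1] for t in tau)

def inverse(sigma):
    """Row of values of the inverse permutation."""
    inv = [0] * len(sigma)
    for i, s in enumerate(sigma, start=1):
        inv[s - 1] = i              # sigma sends i to s, so the inverse sends s to i
    return tuple(inv)

sigma = (3, 4, 2, 1)
tau = (2, 3, 4, 1)

print(compose(sigma, tau))          # sigma tau: apply tau first, then sigma
print(compose(tau, sigma))          # tau sigma: apply sigma first, then tau
print(compose(sigma, inverse(sigma)))
```

The right factor is applied first, which is why the two products differ; composing a permutation with its inverse returns the identity row (1, 2, 3, 4).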


Exercise 1.14 For the six elements of the symmetric group S3, write a multiplica- 
tion table similar to that from formula (1.20) on p. 11. 


Example 1.8 (Abelian Groups) A group G in which every two elements f, g ∈ G
commute, i.e., satisfy the relation fg = gf, is called commutative or abelian.
Examples of abelian groups are the group T of parallel displacements of the
Euclidean plane and the group $\mathrm{SO}_2$ of the rotations of the plane about some fixed
point. For every integer n ≥ 2, rotations by integer multiples of 2π/n form a finite
subgroup of $\mathrm{SO}_2$ called the cyclic group of order n.


1.4 Posets 


1.4.1 Partial Order Relations 


A binary relation⁷ x ≤ y on a set Z is called a partial order if, like an equivalence
relation, it is reflexive and transitive,⁸ but instead of symmetric, it is skew-symmetric,
which means that x ≤ y and y ≤ x imply x = y. If some partial order is given, we
write x < y if x ≤ y and x ≠ y. A partial order on Z is called a total order if for
all x, y ∈ Z, x < y or x = y or y < x holds. For example, the usual inequality
of numbers provides the set of integers ℤ with a total order, whereas the divisibility
relation n | m, meaning that n divides m, is a partial but not total order on the set
of positive integers ℕ. Another important example of a nontotal partial order is the
one provided by inclusions on the set S(X) of all subsets in a given set X.


⁷ See Sect. 1.2 on p. 7.
⁸ See Definition 1.1 on p. 8.


Exercise 1.15 (Preorder) Let a set Z be equipped with a reflexive transitive binary
relation⁹ x ≼ y. We write x ∼ y if both x ≼ y and y ≼ x hold simultaneously. Verify
that ∼ is an equivalence relation and that on the quotient set Z/∼, a partial order is
well defined by the rule [x] ≤ [y] if x ≼ y.


A set P equipped with a partial order is called a partially ordered set, or poset for 
short. If the order is total, we say that P is totally ordered. Every subset X of a poset 
P is certainly a poset with respect to the order on P. Totally ordered subsets of a 
poset P are called chains. Elements x, y ∈ P are called incompatible if neither x ≤ y
nor y ≤ x holds. Otherwise, x, y are said to be compatible. Thus, a partial order is
total if and only if every two elements are compatible. Note that two incompatible 
elements have to be distinct. 

A map f : M → N between two posets is called order-preserving¹⁰ if for all
x, y ∈ M, the inequality x ≤ y implies the inequality f(x) ≤ f(y). Posets M, N are
said to be isomorphic if there is an order-preserving bijection M → N whose inverse
is also order-preserving. We write M ≃ N in this case. A map f is called strictly
increasing if for all x, y ∈ M, the inequality x < y implies the inequality f(x) < f(y).
Every injective order-preserving map is strictly increasing. The converse is true for
maps with totally ordered domain and may fail in general.

An element y ∈ P is called an upper bound for a subset X ⊂ P if x ≤ y for
all x ∈ X. Such an upper bound is called exterior if y ∉ X. In this case, the strict
inequality x < y holds for all x ∈ X.

An element m* ∈ X is called maximal in X if for all x ∈ X, the inequality
m* ≤ x implies x = m*. Note that such an element may be incompatible with some
x ∈ X, and therefore it is not necessarily an upper bound for X. A poset may have
many different maximal elements or may not have any, like the poset ℤ. If X is
totally ordered, then the existence of a maximal element forces such an element to
be unique. Minimal elements are defined symmetrically: m_* ∈ X is called minimal
if ∀ x ∈ X, x ≤ m_* ⇒ x = m_*, and the above discussion for maximal elements
carries over to minimal elements with the obvious changes.
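The difference between a maximal element and an upper bound is easy to observe on a small example. The sketch below (Python; the function names are hypothetical, chosen for this illustration only) lists the maximal and minimal elements of the set {2, 3, ..., 10} ordered by divisibility.

```python
def divides(n, m):
    """Return True if n divides m (n is assumed to be nonzero)."""
    return m % n == 0

def maximal_elements(subset, leq):
    """Elements m such that m <= x implies x == m for all x in subset."""
    return [m for m in subset
            if all(x == m for x in subset if leq(m, x))]

def minimal_elements(subset, leq):
    """Elements m such that x <= m implies x == m for all x in subset."""
    return [m for m in subset
            if all(x == m for x in subset if leq(x, m))]

def upper_bounds_inside(subset, leq):
    """Elements of subset that are upper bounds for the whole subset."""
    return [y for y in subset if all(leq(x, y) for x in subset)]

X = list(range(2, 11))                 # 2, 3, ..., 10 ordered by divisibility
print(maximal_elements(X, divides))    # [6, 7, 8, 9, 10]
print(minimal_elements(X, divides))    # [2, 3, 5, 7]
print(upper_bounds_inside(X, divides)) # []: no maximal element bounds X
```

Here 7 is maximal and minimal at the same time, because it is incompatible with every other element of the set.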


⁹ Every such relation is called a partial preorder on Z.
¹⁰ Also nondecreasing or nonstrictly increasing or a homomorphism of posets.
1.4.2 Well-Ordered Sets


A totally ordered set W is called well ordered if every nonempty subset U ⊂ W has a
minimal element.¹¹ For example, the set ℕ of positive integers is well ordered by the
usual inequality between numbers. All well-ordered sets share one of the most important
properties of the positive integers: they allow proofs by induction. If some statement
Σ = Σ(w) depends on an element w running through a well-ordered set W, then
Σ(w) holds for all w ∈ W as soon as the following two statements are proven:


(1) Σ(w_*) holds for the minimal element w_* of W;
(2) for every x ∈ W, if Σ(w) holds for all w < x, then Σ(x) holds.


This is known as the principle of transfinite induction. 


Exercise 1.16 Verify the principle of transfinite induction. 


Let us write [y) ≝ {w ∈ W | w < y} for the set of all elements strictly preceding y
in a well-ordered set W and call it the initial segment of W preceding y. Note that y
is uniquely determined by [y) as the minimal element in W ∖ [y). For the minimal
element w_* of the whole of W, we set [w_*) ≝ ∅. We write U ≤ W if U ≃ [w)
for some w ∈ W, and write U < W if U ≤ W and U ≠ W. As good training in
the use of the principle of transfinite induction, I strongly recommend the following
exercise.


Exercise 1.17 Show that for any two well-ordered sets U, W, either U < W or U ≃ W or
W < U holds.


Classes of isomorphic well-ordered sets are called cardinals. Thus, the set N can 
be identified with the set of all finite cardinals. All the other cardinals, including N 
itself, are called transfinite. 


1.4.3 Zorn’s Lemma 


Let P be a poset. We write 𝒲(P) for the set of all well-ordered (by the partial order
on P) subsets W ⊂ P. Certainly, 𝒲(P) ≠ ∅, because all one-point subsets of P are
within 𝒲(P). We also include ∅ as an element of 𝒲(P).


Lemma 1.2 For every poset P, there is no map β : 𝒲(P) → P sending each
W ∈ 𝒲(P) to some exterior upper bound of W.


Proof Let such a map β exist. We will say that W ∈ 𝒲(P) is β-stable if β([y)) = y
for all y ∈ W. For example, the set {β(∅), β({β(∅)}), β({β(∅), β({β(∅)})})} is
β-stable, and it certainly can be enlarged by any amount to the right. For any two
β-stable sets U, W ∈ 𝒲(P) with common minimal element, either U ⊂ W or W ⊂ U
holds, because the minimal elements u ∈ U ∖ (U ∩ W) and w ∈ W ∖ (U ∩ W) are
each the β-image of the same initial segment [u) = [w) ⊂ U ∩ W and therefore
must be equal.


¹¹ Such an element is unique, as we have seen above.


Exercise 1.18 Check that the union of all β-stable sets having the same minimal
element p ∈ P is well ordered and β-stable.


Let U be some union from Exercise 1.18. Then β(U) cannot be an exterior upper
bound for U, because otherwise, U ∪ {β(U)} would be a β-stable set with the same
minimal point as U, which forces it to be a subset of U. Contradiction. □


Corollary 1.1 (Zorn’s Lemma I) Suppose that every well-ordered subset in a 
poset P has an upper bound, not necessarily exterior. Then there exists a maximal 
element in P. 


Proof Assume the contrary. Then for all x ∈ P there exists y > x. Hence, the
axiom of choice allows us to choose some exterior upper bound¹² b(W) for every
W ∈ 𝒲(P). The resulting map W ↦ b(W) contradicts Lemma 1.2. □


Exercise 1.19 (Bourbaki-Witt Fixed-Point Lemma) Under the assumption of
Corollary 1.1, show that every map f : P → P such that f(x) ≥ x for all x ∈ P
has a fixed point, i.e., that there exists p ∈ P such that f(p) = p.


Definition 1.2 (Complete Posets) A partially ordered set P is said to be complete if
every totally ordered (with respect to the order on P) subset in P has an upper bound,
not necessarily exterior.


Lemma 1.3 (Zorn’s Lemma II) Every complete poset P has a maximal element. 


Proof Every complete poset surely satisfies the assumption of Corollary 1.1. □


Problems for Independent Solution to Chap. 1 


Problem 1.1 Find the total number of maps from a set of cardinality 6 to a set of 
cardinality 2 such that every point of the target set has at least two elements in its 
preimage. 

Problem 1.2 Let X, Y be finite sets, |X| ≥ |Y|. How many right inverse maps does a
given surjection X ↠ Y have? How many left inverse maps does a given injection
Y ↪ X have?


¹² To be more precise (see Sect. 1.3.2 on p. 11), let I ⊂ 𝒲(P) × P consist of all pairs (W, c) such
that w < c for all w ∈ W. Then the projection π_1 : I → 𝒲(P), (W, c) ↦ W, is surjective, because
by the assumption of the lemma, for every W, there exists some upper bound d, and then we have
assumed that there exists some c > d. Take b : 𝒲(P) → P to be the composition π_2 ∘ g, where
g : 𝒲(P) → I is any section of π_1, followed by the projection π_2 : I → P, (W, c) ↦ c.




Problem 1.3 How many distinct “words” (i.e., strings of letters, not necessarily
actual words) can one get by permuting the letters in the words:

(a) algebra, (b) syzygy, (c) $\underbrace{aa\ldots a}_{\alpha}\,\underbrace{bb\ldots b}_{\beta}$,
(d) $\underbrace{a_1a_1\ldots a_1}_{\alpha_1}\,\underbrace{a_2a_2\ldots a_2}_{\alpha_2}\,\ldots\,\underbrace{a_ma_m\ldots a_m}_{\alpha_m}$?


Problem 1.4 Expand and collect like terms in (a) (a_1 + a_2 + ⋯ + a_m)^2, (b) (a + b + c)^3.

Problem 1.5 Given m, n ∈ ℕ, how many solutions does the equation x_1 + x_2 + ⋯ +
x_m = n have in (a) positive, (b) nonnegative integers x_1, x_2, …, x_m?

Problem 1.6 Count the number of monomials in n variables that have total degree¹³
(a) exactly d, (b) at most d.


Problem 1.7 Is 1000!/(100!^{10}) an integer?
Problem 1.8 For a prime p ∈ ℕ, show that every binomial coefficient $\binom{p}{k}$ with
1 ≤ k ≤ (p − 1) is divisible by p.
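Problem 1.9 Evaluate the sums: (a) $\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n}$, (b) $\binom{n}{0} + \binom{n}{2} + \binom{n}{4} + \cdots$,
(c) $\binom{k}{k} + \binom{k+1}{k} + \cdots + \binom{k+n}{k}$, (d) $\binom{n}{1} + 2\binom{n}{2} + \cdots + n\binom{n}{n}$,
(e) $\binom{n}{0} + 2\binom{n}{1} + \cdots + (n+1)\binom{n}{n}$, (f) $\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \binom{n}{3} + \cdots + (-1)^n\binom{n}{n}$,
(g) $\binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2$.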
Problem 1.10 For given m, n ∈ ℕ, count the total number of (a) arbitrary, (b) bijective,
(c) strictly increasing, (d) injective, (e) nonstrictly increasing, (f) nonstrictly
increasing and surjective, (g) surjective maps {1, 2, …, m} → {1, 2, …, n}.


Problem 1.11 Count the total number of Young diagrams (a) of weight 6, (b) of 
weight 7 and length at most 3, (c) having at most p rows and q columns.


Problem 1.12* (by L.G. Makar-Limanov). A soda jerk is whiling away the time 
manipulating 15 disposable cups stacked on a table in several vertical piles. 
During each manipulation, he removes the topmost cup of each pile and stacks 
these together to form a new pile. What can you say about the distribution of cups 
after 1000 such manipulations? 


Problem 1.13 Given four distinct cups, four identical glasses, ten identical sugar 
cubes, and seven cocktail straws, each of a different color of the rainbow, count the
number of distinct arrangements of (a) straws between cups, (b) sugar between 
cups, (c) sugar between glasses, (d) straws between glasses. (e) Answer the same 
questions under the constraint that every cup or glass must have at least one straw 
or sugar cube (possibly one or more of each) in it. 


Problem 1.14 The sides of a regular planar n-gon lying in three-dimensional 
space are painted in n fixed different colors, one color per side, in all possible 
ways. How many different painted n-gons do we get if two colored n-gons are 
considered the same if one can be obtained from the other by some motion in 
three-space? 


¹³ The total degree of the monomial $x_1^{m_1}x_2^{m_2}\cdots x_n^{m_n}$ equals $\sum_{i=1}^{n} m_i$.

Problem 1.15 How many different necklaces can be made from 5 red, 7 blue, and
11 white otherwise identical glass beads?


Problem 1.16 All the faces of a regular (a) cube, (b) tetrahedron, are painted using 
six fixed colors (different faces in distinct colors) in all possible ways. How many 
different painted polyhedra do we get? 


Problem 1.17 How many different knick-knacks do we get by gluing pairs of the 
previously painted (a) cubes, (b) tetrahedra face to face randomly? 


Problem 1.18 Show that Zorn’s lemma, Lemma 1.3, is equivalent to the axiom of
choice. More precisely, assume that Lemma 1.3 holds for every poset P and prove
that every surjective map f : X ↠ Y admits a section. Hint: consider the set of
maps g_U : U → X such that U ⊂ Y and f g_U = Id_U; equip it with a partial order,
where g_U ≤ g_W means that U ⊂ W and g_W|_U = g_U; verify that Lemma 1.3 can
be applied; prove that every maximal g_U has U = Y.


Problem 1.19 (Hausdorff’s Maximal Chain Theorem) Use Lemma 1.3, Zorn’s
lemma, to prove that every chain in every poset is contained in some maximal 
(with respect to inclusion) chain. Hint: consider the set of all chains containing a 
given chain; equip it with the partial order provided by inclusion; then proceed as 
in the previous problem. 

Problem 1.20 (Zermelo’s Theorem) Write S(X) for the set of all nonempty
subsets in a given set X including X itself. Use the axiom of choice to construct
a map μ : S(X) → X such that μ(Z) ∈ Z for all Z ∈ S(X). Write 𝒲(X) for the
set of all W ∈ S(X) possessing a well ordering such that μ(W ∖ [w)) = w for all
w ∈ W. Verify that 𝒲(X) ≠ ∅, and modify the proof of Lemma 1.2 on p. 15 to
show that X ∈ 𝒲(X). This means that every set can be well ordered.


Chapter 2 
Integers and Residues 


2.1 Fields, Rings, and Abelian Groups 


2.1.1 Definition of a Field 


Given a set X, a map X × X → X is called a binary operation on X. Addition
and multiplication of rational numbers are binary operations on the set ℚ taking
(a, b) ∈ ℚ × ℚ to a + b ∈ ℚ and ab ∈ ℚ respectively. Informally speaking, a field is
a numeric domain whose elements can be added, subtracted, multiplied, and divided 
by the same rules that apply to rational numbers. The precise definition given below 
takes these rules as axioms. 


Definition 2.1 A set F equipped with two binary operations F × F → F, addition
(a, b) ↦ a + b and multiplication (a, b) ↦ ab, is called a field if these operations
satisfy the following three collections of axioms:

PROPERTIES OF ADDITION

commutativity: a + b = b + a ∀ a, b ∈ F (2.1)
associativity: a + (b + c) = (a + b) + c ∀ a, b, c ∈ F (2.2)
existence of zero: ∃ 0 ∈ F : a + 0 = a ∀ a ∈ F (2.3)
existence of opposites: ∀ a ∈ F ∃ (−a) ∈ F : a + (−a) = 0 (2.4)

PROPERTIES OF MULTIPLICATION

commutativity: ab = ba ∀ a, b ∈ F (2.5)
associativity: a(bc) = (ab)c ∀ a, b, c ∈ F (2.6)
existence of unit: ∃ 1 ∈ F : 1a = a ∀ a ∈ F (2.7)
existence of inverses: ∀ a ∈ F ∖ 0 ∃ a⁻¹ ∈ F : aa⁻¹ = 1 (2.8)

RELATIONS BETWEEN ADDITION AND MULTIPLICATION

distributivity: a(b + c) = ab + ac ∀ a, b, c ∈ F (2.9)
nontriviality: 0 ≠ 1 (2.10)


Example 2.1 (Field of Two Elements) The simplest set that satisfies Definition 2.1
consists of the two elements 0, 1, with 0 + 1 = 1 · 1 = 1 and all the other sums and
products equal to 0 (including 1 + 1 = 0). It is denoted by 𝔽_2.


Exercise 2.1 Verify that 𝔽_2 satisfies all the axioms of Definition 2.1.


Elements of 𝔽_2 can be interpreted either as residue classes modulo 2 added and
multiplied by the rules (1.17) from Example 1.4 on p. 8 or as logical “false” = 0 and
“true” = 1. In the latter case, addition and multiplication become logical “XOR” and
“AND” respectively,¹ and algebraic expressions in 𝔽_2 can be thought of as logical
predicates.
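For a finite set, the ten axioms (2.1)-(2.10) can be tested exhaustively. The following Python sketch does this for the two-element field, with addition realized as XOR and multiplication as AND; the function `satisfies_field_axioms` and its signature are assumptions made for this illustration.

```python
from itertools import product

F2 = (0, 1)

def add(a, b):
    return a ^ b   # logical XOR, i.e. addition modulo 2

def mul(a, b):
    return a & b   # logical AND, i.e. multiplication modulo 2

def satisfies_field_axioms(elems, add, mul, zero, one):
    """Exhaustively test axioms (2.1)-(2.10) on a finite set of elements."""
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))
    checks = [
        all(add(a, b) == add(b, a) for a, b in pairs),                       # (2.1)
        all(add(a, add(b, c)) == add(add(a, b), c) for a, b, c in triples),  # (2.2)
        all(add(a, zero) == a for a in elems),                               # (2.3)
        all(any(add(a, b) == zero for b in elems) for a in elems),           # (2.4)
        all(mul(a, b) == mul(b, a) for a, b in pairs),                       # (2.5)
        all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples),  # (2.6)
        all(mul(one, a) == a for a in elems),                                # (2.7)
        all(any(mul(a, b) == one for b in elems)
            for a in elems if a != zero),                                    # (2.8)
        all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
            for a, b, c in triples),                                         # (2.9)
        zero != one,                                                         # (2.10)
    ]
    return all(checks)

print(satisfies_field_axioms(F2, add, mul, 0, 1))
```

The same function rejects, for example, the residues modulo 4 with the usual operations, because the class of 2 has no inverse there.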


Exercise 2.2 Write down a polynomial in x with coefficients in 𝔽_2 that evaluates to
NOT x and a polynomial in x, y that evaluates to x OR y.²


Example 2.2 (Rational Numbers) The field of rational numbers ℚ is the main
motivating example for Definition 2.1. As a set, ℚ consists of fractions a/b, which
are equivalence classes³ of pairs (a, b), where a, b ∈ ℤ, b ≠ 0, modulo the
equivalence generated by the relations

(a, b) ∼ (sa, sb) ∀ s ∈ ℤ ∖ 0. (2.11)

This equivalence is exhausted by the relations

(a_1, b_1) ∼ (a_2, b_2) for all a_1 b_2 = a_2 b_1, (2.12)

and each relation (2.12) can be achieved by at most a two-step chain of relations
(2.11). Addition and multiplication of fractions are defined by the rules

$$\frac{a}{b} + \frac{c}{d} \overset{\text{def}}{=} \frac{ad + bc}{bd}, \qquad \frac{a}{b} \cdot \frac{c}{d} \overset{\text{def}}{=} \frac{ac}{bd}. \tag{2.13}$$

¹ Logical “exclusive OR” (XOR): a + b is true if and only if precisely one of a, b is true (and not
both). Logical AND: a · b is true if and only if both a and b are true.

² Logical NOT x evaluates to true if and only if x is false. Nonexclusive x OR y is true if and only if
at least one of x and y is true.

³ See Example 1.5 on p. 9.
Exercise 2.3 Verify the consistency of these definitions⁴ and that axioms (2.1)-
(2.10) are satisfied.
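Consistency here means that the sum and the product of two fractions, a/b + c/d = (ad + bc)/(bd) and (a/b)·(c/d) = (ac)/(bd), do not depend on the pairs chosen to represent them. The Python sketch below tests this on one example, representing fractions as raw integer pairs; the helper names are illustrative, and the standard `fractions.Fraction` type is used only as an independent cross-check.

```python
from fractions import Fraction

def equivalent(p, q):
    """(a1, b1) ~ (a2, b2) if and only if a1*b2 == a2*b1."""
    return p[0] * q[1] == q[0] * p[1]

def add(p, q):
    """Sum of pairs: a/b + c/d = (ad + bc)/(bd)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def mul(p, q):
    """Product of pairs: (a/b)*(c/d) = (ac)/(bd)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)

p, p2 = (1, 2), (3, 6)      # two representatives of 1/2
q, q2 = (2, 3), (-4, -6)    # two representatives of 2/3

print(equivalent(add(p, q), add(p2, q2)))   # sums agree up to equivalence
print(equivalent(mul(p, q), mul(p2, q2)))   # products agree up to equivalence
print(Fraction(*add(p, q)) == Fraction(1, 2) + Fraction(2, 3))
```

A single example is no proof, but a failed check would immediately expose an ill-defined rule.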


Example 2.3 (Real Numbers) The set of real numbers ℝ can be defined in several
ways: either as a set of equivalence classes of rational Cauchy sequences,⁵ or as
the set of Dedekind sections⁶ of ℚ, or as the set of equivalence classes of infinite
decimal⁷ fractions.⁸ Whichever definition of ℝ is chosen, the description of addition
and multiplication as well as verification of axioms (2.1)-(2.10) requires some
intellectual effort, which is generally undertaken in the first chapters of a course
in real analysis. I hope that you have taken such a course.


2.1.2 Commutative Rings 


A set K equipped with addition and multiplication is called a commutative ring 
with unit if these operations satisfy all the axioms of Definition 2.1 on p. 19 
except for (2.8). This means that not all nonzero elements are invertible. The main 
motivating examples of commutative rings with unit are provided by the ring of 
integers ℤ and the rings K[x] of polynomials in the variable x with coefficients in an
arbitrary commutative ring K with unit. 

If the existence of a unit and the nontriviality axioms (2.7), (2.10) are also 
excluded along with (2.8) from Definition 2.1 on p. 19, a set K equipped with two 
operations possessing all the remaining properties is called just a commutative ring. 
The even integers and the polynomials with even integer coefficients are examples of 
commutative rings without a unit. The zero ring, consisting of only the zero element, 
is also a commutative ring. 


2.1.3 Abelian Groups 


A set A equipped with one binary operation A × A → A is called an abelian group
if the operation satisfies the first four axioms (2.1)—(2.4) from Definition 2.1, i.e., if 
it is commutative and associative, and possesses a zero element as well as opposite 


⁴ That is, check that the equivalence classes of the results are not changed when the operands are
replaced by equivalent fractions.

⁵ A sequence is said to be Cauchy if for every positive ε, all but a finite number of elements of the
sequence lie within an interval of length ε. Two Cauchy sequences {a_n}, {b_n} are equivalent if the
sequence {a_1, b_1, a_2, b_2, …} is Cauchy.

⁶ A Dedekind section is a partition ℚ = X ⊔ Y such that there is no minimal element in Y and x < y
for all x ∈ X, y ∈ Y.

⁷ Or any other positional scale of notation.

⁸ Such an equivalence identifies the decimal a_1 a_2 … a_n . b_1 b_2 … b_m 999… with the decimal
a_1 a_2 … a_n . b_1 b_2 … b_{m−1} (b_m + 1) 000…, where a_i, b_j are decimal digits and b_m ≠ 9.
elements to all a ∈ A. Thus, every commutative ring K is an abelian group with
respect to addition. This group is called the additive group of the ring K. The main 
motivating example of an abelian group not related directly to a ring is provided by 
vectors. 


Example 2.4 (Geometric Vectors) In the framework of Euclidean geometry as 
studied in high school, let us declare two directed segments to be equivalent if 
they are parallel displacements of each other. The equivalence classes of directed 
segments are called geometric vectors. The zero vector, i.e., the class of the empty
segment, is also considered a vector. Vectors can be depicted by arrows considered 
up to a translation in the plane. An addition of vectors is defined by the triangle 
rule: translate the arrows representing vectors a, b in such a way that the head of a 
coincides with the tail of b and declare a + b to be an arrow going from the tail of 
a to the head of b. Commutativity and associativity of addition are established by 
means of the parallelogram and quadrangle diagrams shown in Figs. 2.1 and 2.2: 


Fig. 2.1 The parallelogram rule 


Fig. 2.2 The quadrangle rule


The opposite vector —a of a is obtained by reversing the direction of a. 


Example 2.5 (The Multiplicative Group of a Field) The properties of multiplication 
listed in axioms (2.5)—(2.8) from Definition 2.1 on p. 19 assert that the set of nonzero 
elements in a field F is an abelian group with respect to multiplication. This group 
is called the multiplicative group of F and is denoted by F* ≝ F ∖ 0. The role of the
zero element in the multiplicative group is played by the unit. In an abstract abelian
group, such an element is called the identity element or neutral element of the group. 
The multiplicative version of the opposite is provided by the inverse element. 


Lemma 2.1 In every abelian group A, the neutral element is unique, and for each
a ∈ A, its opposite −a is uniquely determined by a. In particular, −(−a) = a.


Proof Let us write + for the operation in A. If there are two identity elements 0_1
and 0_2, then 0_1 = 0_1 + 0_2 = 0_2, where the first equality holds because 0_2 is an
identity element, and the second holds because 0_1 is an identity element. If there are
two elements −a and −a′ both opposite to a, then −a = (−a) + 0 = (−a) + (a +
(−a′)) = ((−a) + a) + (−a′) = 0 + (−a′) = −a′. □


Lemma 2.2 In every commutative ring K, the equality 0 · a = 0 holds for all a ∈ K.
If K has a unit, then for every a ∈ K, the product (−1) · a equals the opposite element
of a.


Proof Let a · 0 = b. Then b + b = a · 0 + a · 0 = a(0 + 0) = a · 0 = b.
Adding (−b) to both sides, we get b = 0. The second statement follows from the
computation (−1) · a + a = (−1) · a + 1 · a = ((−1) + 1) · a = 0 · a = 0. □


Remark 2.1 In the presence of all the other axioms, the nontriviality axiom (2.10) in
the definition of a field is equivalent to the requirement F ≠ {0}. Indeed, if 0 = 1,
then for each a ∈ F, we get a = a · 1 = a · 0 = 0.


2.1.4 Subtraction and Division 


It follows from Lemma 2.1 that in every abelian group, a subtraction operation is 
well defined by the rule 


a − b ≝ a + (−b). (2.14)


In particular, subtraction is defined in the additive group of every commutative ring. 

It follows from Lemma 2.1 and Lemma 2.2 applied to the multiplicative group of
a field that every field F has a unique unit and for every a ∈ F*, its inverse element
a⁻¹ is uniquely determined by a. Therefore, F* admits a division operation defined
by the rule


a/b ≝ ab⁻¹, where b ≠ 0. (2.15)


2.2 The Ring of Integers


2.2.1 Divisibility 


An element a in a commutative ring K with unit is said to be invertible if there
exists an element a⁻¹ ∈ K such that a⁻¹a = 1. Otherwise, a is noninvertible. In
the ring of integers ℤ, there are just two invertible elements: ±1. In the ring ℚ[x] of
polynomials in x with rational coefficients, the invertible elements are the nonzero
constants, i.e., the nonzero polynomials of degree zero.

An element a in a commutative ring K is said to be divisible by b ∈ K if there
is an element q ∈ K such that a = bq. In this case, we write b | a or a ⋮ b and call
q the quotient of a by b. Divisibility is closely related to the solvability of linear
equations.


2.2.2 The Equation ax + by = k and the Greatest Common 
Divisor in Z 


Let us fix some a, b ∈ ℤ, not both zero, and write

(a, b) ≝ {ax + by | x, y ∈ ℤ} (2.16)


for the set of all integers represented as ax + by for some integers x, y. This set is a
subring of ℤ, and for every z ∈ (a, b), all its multiples mz lie in (a, b) as well. Note
that we have a, b ∈ (a, b), and every element of (a, b) is divisible by every common
divisor of a and b. Write d for the smallest positive number in (a, b). For every
z ∈ (a, b), the remainder r of division of z by d lies in (a, b), because r = z − kd
and both of z, kd lie in the ring (a, b). Since 0 ≤ r < d, we conclude that r = 0
by the choice of d. Thus, the set (a, b) coincides with the set of all multiples of d.
Therefore, d divides both a and b and is divisible by every common divisor of a
and b. The number d is called the greatest common divisor of a, b ∈ ℤ and is
denoted by GCD(a, b).

By the way, the arguments above demonstrate that for given a, b, n ∈ ℤ, the
equation ax + by = n is solvable in x, y ∈ ℤ if and only if n ⋮ GCD(a, b).
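The description of the greatest common divisor as the smallest positive value of ax + by can be tested numerically. The Python sketch below searches a bounded window of coefficients by brute force and compares the result with the library function `math.gcd`; the search bound and the helper names are assumptions of this illustration, and the search is reliable only when suitable coefficients exist within the bound.

```python
from math import gcd

def smallest_positive_combination(a, b, bound=50):
    """Smallest positive a*x + b*y over |x|, |y| <= bound (brute force)."""
    values = (a * x + b * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1))
    return min(v for v in values if v > 0)

def is_solvable(a, b, n):
    """ax + by = n is solvable in integers iff gcd(a, b) divides n."""
    return n % gcd(a, b) == 0

a, b = 84, 30
print(smallest_positive_combination(a, b), gcd(a, b))   # 6 6
print(is_solvable(a, b, 18), is_solvable(a, b, 20))     # True False
```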


Exercise 2.4 Given an arbitrary finite collection of numbers a_1, a_2, …, a_m ∈ ℤ,
use an appropriate generalization of the previous construction to produce a number
d = a_1 x_1 + a_2 x_2 + ⋯ + a_m x_m, x_i ∈ ℤ, that divides all the a_i and is divisible by all
common divisors of the a_i. Prove that the equation a_1 x_1 + a_2 x_2 + ⋯ + a_m x_m = n is
solvable in x_i ∈ ℤ if and only if n ⋮ d.


The number d from Exercise 2.4 is denoted by GCD(a_1, a_2, …, a_m) and called the
greatest common divisor of a_1, a_2, …, a_m.


2.2.3 The Euclidean Algorithm


The Euclidean algorithm computes GCD(a, b) together with the expansion
GCD(a, b) = ax + by as follows. Let a ≥ b. Put


E_0 = a, E_1 = b, E_k = remainder of division of E_{k−2} by E_{k−1} (for k ≥ 2). (2.17)


The numbers E_k are strictly decreasing until some E_r divides E_{r−1}, and we get
E_{r+1} = 0. The last nonzero number E_r in the sequence E_k is equal to GCD(a, b).


Exercise 2.5 Prove this. 


During the calculations, it is not onerous to write down all the numbers E_k in the
form x · E_0 + y · E_1. This leads to the required representation E_r = x · E_0 + y · E_1. For
example, for a = 10203 and b = 4687, the computation consists of seven steps:


E_0 = 10203
E_1 = 4687
E_2 = 829 = E_0 − 2E_1 = +1 E_0 − 2 E_1
E_3 = 542 = E_1 − 5E_2 = −5 E_0 + 11 E_1
E_4 = 287 = E_2 − E_3 = +6 E_0 − 13 E_1
E_5 = 255 = E_3 − E_4 = −11 E_0 + 24 E_1          (2.18)
E_6 = 32 = E_4 − E_5 = +17 E_0 − 37 E_1
E_7 = 31 = E_5 − 7E_6 = −130 E_0 + 283 E_1
E_8 = 1 = E_6 − E_7 = +147 E_0 − 320 E_1
[E_9 = 0 = E_7 − 31E_8 = −4687 E_0 + 10203 E_1]


This shows that GCD(10203, 4687) = 1 = 147 · 10203 − 320 · 4687. The bottom
row in brackets was included to check the result. Moreover, it computes the least
common multiple LCM(a, b) together with the associated factors LCM(a, b)/a and
LCM(a, b)/b.
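The bookkeeping of this computation is easily mechanized. The following Python sketch keeps, for every E_k, the coefficients x_k, y_k with E_k = x_k · a + y_k · b and stops at the zero row; the function name `euclid_rows` and the row layout are choices made for this illustration.

```python
def euclid_rows(a, b):
    """Rows (E_k, x_k, y_k) with E_k = x_k*a + y_k*b, ending with the row E = 0."""
    rows = [(a, 1, 0), (b, 0, 1)]              # E_0 = a and E_1 = b
    while rows[-1][0] != 0:
        (e0, x0, y0), (e1, x1, y1) = rows[-2], rows[-1]
        q = e0 // e1                           # incomplete quotient of E_{k-2} by E_{k-1}
        rows.append((e0 - q * e1, x0 - q * x1, y0 - q * y1))
    return rows

rows = euclid_rows(10203, 4687)
d, x, y = rows[-2]        # last nonzero row: the GCD and its expansion
_, q0, q1 = rows[-1]      # zero row: 0 = q0*a + q1*b
print(d, x, y)            # 1 147 -320
print(q0, q1)             # -4687 10203
```

The row before the last reproduces the expansion 1 = 147 · 10203 − 320 · 4687, and the last row is the bracketed check row of the hand computation.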


Exercise 2.6 Prove that in the expression 0 = E_{r+1} = q_0 E_0 + q_1 E_1, which appears
after the last step of the Euclidean algorithm, the absolute value |q_0 E_0| = |q_1 E_1| is
equal to LCM(a, b).


Remark 2.2 The Euclidean algorithm is much, much faster than prime factorization.
To convince yourself of this, just try to calculate the prime factorizations for 10203 or
4687 by hand and compare this with the hand calculation (2.18). Given the product
of two very large prime numbers, recovering those primes is too difficult even for
supercomputers. Many data encryption systems are based on this fact.


2.3 Coprime Elements 


In the ring Z, the condition GCD(a, b) = 1 is equivalent to the solvability of the 
equation ax + by = 1 in x,y. Integers a, b for which this equation is solvable are 
said to be coprime. 

For an arbitrary commutative ring K with unit, the solvability of the equation
ax + by = 1 forces every common divisor d of a, b to be invertible in K, because
a = dα, b = dβ, and ax + by = 1 imply d(αx + βy) = 1. However, if all common
divisors of a, b are invertible, the equation ax + by = 1 may be unsolvable in general.
For example, in the ring ℚ[x, y] of polynomials in x, y with rational coefficients,
the monomials x and y do not have nonconstant common divisors, but the equality


f(x, y) · x + g(x, y) · y = 1 fails for all f, g ∈ ℚ[x, y].
Exercise 2.7 Explain why. 


Nevertheless, just the solvability of ax + by = 1 leads to most of the nice properties 
of a, b known for coprime integers. Therefore, for arbitrary rings the next definition 
is reasonable. 


Definition 2.2 Elements a, b of an arbitrary commutative ring K with unit are called 
coprime if the equation ax + by = 1 is solvable in x, y ∈ K.


Lemma 2.3 Let K be an arbitrary commutative ring with unit and let a, b ∈ K be
coprime. Then for every c ∈ K,


b | ac ⇒ b | c, (2.19)
a | c & b | c ⇒ ab | c. (2.20)

Furthermore, if a ∈ K is coprime to each of b_1, b_2, …, b_n, then a is coprime to their
product b_1 b_2 ⋯ b_n.


Proof Multiplying both sides of ax + by = 1 by c, we get the equality c = acx + bcy,
which gives both implications (2.19), (2.20). If for each i = 1, 2, …, n, there exist
x_i, y_i ∈ K such that ax_i + b_i y_i = 1, then by multiplying all these equalities together
and expanding on the left-hand side, we get


(b_1 b_2 ⋯ b_n) · (y_1 y_2 ⋯ y_n) + monomials divisible by a = 1.


This leads to the required equality a · (something) + (b_1 b_2 ⋯ b_n) · (y_1 y_2 ⋯ y_n) = 1.
□
Exercise 2.8 Use Lemma 2.3 to prove the factorization theorem for ℤ: each element
n ∈ ℤ is equal to a finite product of prime integers⁹ and any two prime factorizations


p_1 p_2 ⋯ p_k = n = q_1 q_2 ⋯ q_m

have k = m and (after appropriate renumbering of the factors) p_i = ±q_i for all i.


Remark 2.3 (GCD in an Arbitrary Commutative Ring) Given two elements a, b in 
a commutative ring K, an element d € K dividing both a and b and divisible by 
every common divisor of a and b is called a greatest common divisor of a and b. A 
greatest common divisor may not exist in general. If it exists, it may not be unique, 
and it may not be representable as d = ax + by. If a ring K possesses a unit, then 
for every greatest common divisor d of a and b and invertible element s € K, the 
product sd is a greatest common divisor of a and b as well. 


2.4 Rings of Residues 


2.4.1 Residue Classes Modulo n 


Recall¹⁰ that two numbers a, b ∈ ℤ are said to be congruent modulo n if n divides
a − b, and in this case, we write a ≡ b (mod n). We know from Example 1.4 on
p. 8 that congruence modulo n is an equivalence relation that decomposes ℤ into
a disjoint union of equivalence classes called residue classes modulo n. We write
ℤ/(n) for the set of residue classes and denote the residue class of an integer a ∈ ℤ
by [a]_n ∈ ℤ/(n). Note that the same residue class may be written in many different
ways: [x]_n = [y]_n if and only if x = y + dn for some d ∈ ℤ. Nevertheless, by
Exercise 1.10 on p. 9, the addition and multiplication of residue classes are well
defined by the rules


[a] + [b] ≝ [a + b], [a] · [b] ≝ [ab], (2.21)


in the sense that the resulting residue classes do not depend on the choice of a, b ∈ ℤ
representing the classes [a], [b] ∈ ℤ/(n). The operations (2.21) clearly satisfy the
definition of a commutative ring with unit, because their right-hand sides deal with
the operations within the commutative ring ℤ, where all the axioms hold. Thus,
ℤ/(n) is a commutative ring with unit. It consists of n elements, which can be
written, e.g., as [0]_n, [1]_n, …, [(n − 1)]_n.
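The independence of the sum and product classes from the chosen representatives can be observed directly. In the Python sketch below, a residue class modulo n is represented by its remainder in {0, 1, ..., n − 1}; the function name `residue` is an assumption of this illustration.

```python
def residue(a, n):
    """Canonical representative of the class [a]_n in {0, 1, ..., n-1}."""
    return a % n   # for n > 0, Python returns a nonnegative remainder

n = 12
a, b = 7, 10
a2, b2 = a + 5 * n, b - 3 * n    # other representatives of the same two classes

print(residue(a, n) == residue(a2, n), residue(b, n) == residue(b2, n))
print(residue(a + b, n) == residue(a2 + b2, n))   # [a] + [b] is well defined
print(residue(a * b, n) == residue(a2 * b2, n))   # [a] * [b] is well defined
```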


⁹ An integer is prime if it is not equal to the product of two noninvertible integers.
¹⁰ See Sect. 1.2 on p. 7.


2.4.2 Zero Divisors and Nilpotents


In ℤ/(10), the product of nonzero classes [2] · [5] = [10] equals zero. Similarly, in
ℤ/(8), the nonzero element [2] has a zero cube: [2]^3 = [8] = [0].

In an arbitrary commutative ring K, a nonzero element a ∈ K is called a zero
divisor if ab = 0 for some nonzero b ∈ K. Note that an invertible element a ∈ K
cannot divide zero, because multiplication of both sides of the equality ab = 0 by
a⁻¹ forces b = 0. In particular, a commutative ring with zero divisors cannot be a
field.

A commutative ring K with unit is called an integral domain if there are no zero
divisors in K. For example, ℤ and ℚ[x] are both integral domains.

A nonzero element a ∈ K is called nilpotent if a^n = 0 for some n ∈ ℕ. Clearly,
every nilpotent element is a zero divisor. A commutative ring K with unit is called
reduced if there are no nilpotent elements in K. Therefore, every integral domain is
reduced.
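In a residue ring, zero divisors and nilpotents can be listed by exhaustive search. The Python sketch below does so for ℤ/(n), with classes represented by the integers 1, ..., n − 1; both function names are hypothetical.

```python
def zero_divisors(n):
    """Nonzero classes a in Z/(n) with a*b = 0 for some nonzero class b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

def nilpotents(n):
    """Nonzero classes a in Z/(n) with a**k = 0 for some exponent 1 <= k <= n."""
    return [a for a in range(1, n)
            if any(pow(a, k, n) == 0 for k in range(1, n + 1))]

print(zero_divisors(10), nilpotents(10))   # [2, 4, 5, 6, 8] []
print(zero_divisors(8), nilpotents(8))     # [2, 4, 6] [2, 4, 6]
print(zero_divisors(7), nilpotents(7))     # [] []
```

Thus ℤ/(10) is reduced but is not an integral domain, whereas in ℤ/(8) every zero divisor is nilpotent.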


Exercise 2.9 Prove that if a is a nilpotent element in a commutative ring with unit,
then 1 + a is invertible.


2.4.3 Invertible Elements in Residue Rings 


A residue class [m]_n ∈ ℤ/(n) is invertible if and only if [m]_n [x]_n = [mx]_n = [1]_n
for some [x]_n ∈ ℤ/(n). The latter means the existence of x, y ∈ ℤ such that mx +
ny = 1 in ℤ. Such x, y exist if and only if GCD(m, n) = 1 in ℤ. This can be
checked by the Euclidean algorithm, which allows us to find the required (x, y) as
well if m, n are coprime. Thus, the residue class [x] inverse to [m], if it exists, can
be easily computed. For example, the calculations made in formula (2.18) on p. 25
show that the class [10203] is invertible in ℤ/(4687) and [10203]⁻¹ = [147] in ℤ/(4687).
At the same time, we conclude that the class [4687] is invertible in ℤ/(10203) and
[4687]⁻¹ = −[320] in ℤ/(10203).
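As an illustration, here is a minimal Python sketch of this computation. The function `inverse_mod` is a hypothetical helper that runs the Euclidean algorithm while tracking only the coefficient of m; the built-in three-argument `pow` with exponent −1 (available from Python 3.8) serves as an independent check.

```python
def inverse_mod(m, n):
    """Inverse of the class [m] in Z/(n), or None if GCD(m, n) != 1.

    Invariant: r_i is congruent to x_i * m modulo n at every step.
    """
    r0, x0 = n, 0
    r1, x1 = m % n, 1
    while r1 != 0:
        q = r0 // r1
        r0, x0, r1, x1 = r1, x1, r0 - q * r1, x0 - q * x1
    return x0 % n if r0 == 1 else None

print(inverse_mod(10203, 4687))           # 147
print(inverse_mod(4687, 10203) - 10203)   # -320
print(inverse_mod(2, 10))                 # None: the class [2] is a zero divisor
print(pow(10203, -1, 4687))               # 147, via the built-in pow
```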

The invertible elements of a commutative ring K with unit form a multiplicative
abelian group called the group of invertible elements¹¹ and denoted by K*.

The group ℤ/(n)* of invertible residue classes consists of classes [m]_n ∈ ℤ/(n)
such that GCD(m, n) = 1. The order¹² of this group is equal to the number of positive
integers m that are strictly less than n and coprime to n. This number is denoted by

φ(n) ≝ |ℤ/(n)*|.

The map φ : ℕ → ℕ, n ↦ φ(n), is called Euler’s φ-function.


¹¹ In other terminology, the group of units.
¹² The order of a group is the number of its elements.
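For small n, the group of invertible residue classes and its order can be found by direct enumeration. The Python sketch below assumes n ≥ 2 and uses the illustrative names `units` and `phi`; counting coprime residues one by one is meant only as a transparent definition, not as an efficient method.

```python
from math import gcd

def units(n):
    """Representatives 0 < m < n of the invertible classes of Z/(n), n >= 2."""
    return [m for m in range(1, n) if gcd(m, n) == 1]

def phi(n):
    """Order of the group of invertible classes modulo n, for n >= 2."""
    return len(units(n))

n = 10
U = units(n)
print(U, phi(n))    # [1, 3, 7, 9] 4

# the invertible classes are closed under multiplication modulo n
print(all((a * b) % n in U for a in U for b in U))
```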


2.4.4 Residue Fields 


It follows from the above description of invertible residue classes that the ring Z/(n) 
is a field if and only if n is a prime number, because only a prime number p is 
coprime to all positive integers less than p. For a prime p ∈ ℕ, the residue class field
ℤ/(p) is denoted by 𝔽_p.


Example 2.6 (Binomial Formula Modulo p) For a prime p ∈ ℕ, there is a
remarkable identity in the residue class field 𝔽_p = ℤ/(p), namely

$$\underbrace{1 + 1 + \cdots + 1}_{p \text{ times}} = 0. \tag{2.22}$$

It forces the sum of m ones to vanish as soon as m is a multiple of p. In particular, the
sum of

$$\binom{p}{k} = \frac{p(p-1)\cdots(p-k+1)}{k(k-1)\cdots 1}$$

ones vanishes for all 1 ≤ k ≤ (p − 1). Indeed, by Lemma 2.3 on p. 26, for such
k, the number p is coprime to the product in the denominator. Then by the same
lemma, the denominator divides the product (p − 1)⋯(p − k + 1). Therefore, the
entire quotient is divisible by p.

We conclude that in 𝔽_p, the binomial formula (1.7) on p. 6 becomes

(a + b)^p = a^p + b^p, (2.23)

because after expanding the left-hand side as explained in Example 2.3 on p. 5, we
get for each k exactly $\binom{p}{k}$ similar monomials a^k b^{p−k}, producing the sum of $\binom{p}{k}$ ones
as the coefficient of a^k b^{p−k} on the right-hand side.
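Both facts, the divisibility of the inner binomial coefficients by p and the resulting identity (a + b)^p = a^p + b^p in ℤ/(p), can be confirmed numerically for small primes. The Python sketch below relies on the standard function `math.comb` (Python 3.8 or later); the two helper names are chosen for this illustration.

```python
from math import comb

def inner_binomials_divisible(p):
    """True if p divides C(p, k) for every 1 <= k <= p - 1."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

def power_is_additive(p):
    """True if (a + b)^p == a^p + b^p in Z/(p) for all residues a, b."""
    return all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
               for a in range(p) for b in range(p))

for p in (2, 3, 5, 7, 11):
    print(p, inner_binomials_divisible(p), power_is_additive(p))

# for the composite modulus 4 both properties fail
print(inner_binomials_divisible(4), power_is_additive(4))   # False False
```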


Exercise 2.10 Prove the congruence

$$\binom{mp^n}{p^n} \equiv m \pmod{p}$$

for every prime p ∈ ℕ and all m ∈ ℕ coprime to p.
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Theorem 2.1 (Fermat’s Little Theorem) a’? = a(mod p) for every a € Zand 
prime p EN. 
Proof We have to show that [a”] = [a] in F,,. This follows immediately from (2.23): 
la? = (+ 0 +- + 0)? =P + OP + + OP 
ee a 


a times a times 


= (+0 +--+ 0) =[d. 


a times 
oO 
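As a quick sanity check of Theorem 2.1, here is a small Python sketch (the helper name `fermat_holds` is hypothetical). It uses the built-in three-argument `pow`, which computes powers modulo a given number and also accepts negative bases.

```python
def fermat_holds(p, a):
    """Check the congruence a^p = a (mod p) for given integers p > 1 and a."""
    return pow(a, p, p) == a % p

if __name__ == "__main__":
    # The congruence holds for every integer a when p is prime ...
    print(all(fermat_holds(7, a) for a in range(-20, 21)))
    # ... but may fail for a composite modulus, e.g. 2^4 = 16 = 0 (mod 4).
    print(fermat_holds(4, 2))
```

Note that the congruence can hold by accident for particular composite moduli and bases, so a successful check for one $a$ does not prove that $p$ is prime.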
2.5 Direct Products of Commutative Groups and Rings

The set-theoretic product

\[ \prod_{\nu} A_\nu = A_1 \times A_2 \times \cdots \times A_m
   = \{ (a_1, a_2, \ldots, a_m) \mid a_\nu \in A_\nu \} \tag{2.24} \]

of abelian groups $A_1, A_2, \ldots, A_m$ possesses the structure of an abelian group with
addition defined by

\[ (a_1, a_2, \ldots, a_m) + (b_1, b_2, \ldots, b_m) \stackrel{\text{def}}{=} (a_1 + b_1, a_2 + b_2, \ldots, a_m + b_m), \tag{2.25} \]

where the $i$th components are added within the $i$th group $A_i$.


Exercise 2.11 Verify that addition (2.25) is commutative and associative, its neutral
element is $(0, 0, \ldots, 0)$, and the opposite element to $(a_1, a_2, \ldots, a_m)$ is the element
$(-a_1, -a_2, \ldots, -a_m)$.


The abelian group $\prod A_\nu$ obtained in this way is called the direct product of abelian
groups $A_i$. If all the groups $A_i$ are finite, their direct product is also finite, of order
$\left|\prod A_i\right| = \prod |A_i|$. The direct product is well defined for any family of abelian groups
$A_x$ indexed by elements $x \in X$ of an arbitrary set $X$, not necessarily finite. We write
$\prod_{x \in X} A_x$ for such a product.

Similarly, a direct product of commutative rings $K_x$, where $x$ runs through some
set $X$, consists of families $(a_x)_{x \in X}$ formed by elements $a_x \in K_x$. Addition and
multiplication of such families is defined componentwise:

\[ (a_x)_{x \in X} + (b_x)_{x \in X} = (a_x + b_x)_{x \in X}, \qquad
   (a_x)_{x \in X} \cdot (b_x)_{x \in X} = (a_x \cdot b_x)_{x \in X}. \]

The resulting ring is denoted by $\prod_{x \in X} K_x$ as well.


Exercise 2.12 Check that $\prod K_x$ actually is a commutative ring. If each $K_x$ has unit
$1_x \in K_x$, verify that the family $(1_x)_{x \in X}$ is the unit in $\prod K_x$.

For example, let $X = \mathbb{R}$ and all $K_x = \mathbb{R}$ as well. Then the product $\prod_{x \in \mathbb{R}} \mathbb{R}_x$ formed
by $\mathbb{R}$ copies of $\mathbb{R}$ is isomorphic to the ring of all functions $f : \mathbb{R} \to \mathbb{R}$, where the
operations are the usual addition and multiplication of functions. The isomorphism
takes a family of real numbers $(f_x) \in \prod_{x \in \mathbb{R}} \mathbb{R}_x$ to the function $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto f_x$.

If some element $a_x \in K_x$ in the family $(a_x) \in \prod K_x$, with not all the $a_x$ equal
to zero, either equals zero or divides zero in $K_x$, then the family is a zero divisor in
$\prod K_x$. For example, $(0, 1, \ldots, 1)$ divides zero, because of

\[ (0, 1, \ldots, 1) \cdot (1, 0, \ldots, 0) = (0, 0, \ldots, 0). \]

Thus, the direct product of more than one nonzero ring cannot be a field.


Exercise 2.13 Show that for two given prime numbers $p, q \in \mathbb{N}$, the product $\mathbb{F}_p \times \mathbb{F}_q$
consists of the zero element, $(p-1)(q-1)$ invertible elements, and $p + q - 2$ zero
divisors $(a, 0)$, $(0, b)$, where $a \ne 0$ and $b \ne 0$. Note that $(\mathbb{F}_p \times \mathbb{F}_q)^* \simeq \mathbb{F}_p^* \times \mathbb{F}_q^*$.


If all rings $K_x$ possess units, then the invertible elements in the direct product
$\prod K_x$ are exactly those sequences $(a_x)$ such that each $a_x$ is invertible in $K_x$.
Therefore, the group of invertible elements in $\prod K_x$ coincides with the direct product
of groups $K_x^*$:

\[ \Bigl( \prod K_x \Bigr)^{*} = \prod K_x^{*}. \tag{2.26} \]
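Formula (2.26) and the count in Exercise 2.13 can be observed experimentally. The sketch below is an illustration in Python under the assumption that both factors are residue rings $\mathbb{Z}/(m)$ and $\mathbb{Z}/(n)$; the function name `classify` is hypothetical. It finds units and zero divisors by brute force straight from the definitions, without using (2.26).

```python
from itertools import product

def classify(m, n):
    """Return (units, zero_divisors) of the ring Z/(m) x Z/(n), found by brute force."""
    elements = list(product(range(m), range(n)))
    zero, one = (0, 0), (1 % m, 1 % n)

    def mul(x, y):
        # Componentwise multiplication in the direct product.
        return (x[0] * y[0] % m, x[1] * y[1] % n)

    units = [x for x in elements if any(mul(x, y) == one for y in elements)]
    zero_divisors = [
        x for x in elements
        if x != zero and any(y != zero and mul(x, y) == zero for y in elements)
    ]
    return units, zero_divisors

if __name__ == "__main__":
    units, zero_divisors = classify(3, 5)
    print(len(units), len(zero_divisors))
```

For $p = 3$, $q = 5$ this yields $(p-1)(q-1) = 8$ units and $p + q - 2 = 6$ zero divisors, and every unit has both components nonzero, in agreement with (2.26).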


2.6 Homomorphisms 


2.6.1 Homomorphisms of Abelian Groups 


A map of abelian groups $\varphi : A \to B$ is called a homomorphism if it respects the
group operation, that is, if for all $a_1, a_2 \in A$, the equality

\[ \varphi(a_1 + a_2) = \varphi(a_1) + \varphi(a_2) \tag{2.27} \]

holds in the group $B$.


Exercise 2.14 Verify that the composition of homomorphisms is again a homomorphism.


Lemma 2.4 For every homomorphism of abelian groups $\varphi : A \to B$, the equalities

\[ \varphi(0) = 0 \quad \text{and} \quad \varphi(-a) = -\varphi(a) \quad (\text{for all } a \in A) \]

hold. In particular, $\operatorname{im} \varphi = \varphi(A) \subset B$ is a subgroup.

Proof Since $\varphi(0) = \varphi(0 + 0) = \varphi(0) + \varphi(0)$, subtraction of $\varphi(0)$ from both sides
forces $0 = \varphi(0)$. The second equality is verified by the computation $\varphi(a) + \varphi(-a) =
\varphi(a + (-a)) = \varphi(0) = 0$. □


2.6.2 Kernel of a Homomorphism 


For every homomorphism of abelian groups $\varphi : A \to B$, the fiber of $\varphi$ over the zero
element $0 \in B$ is called the kernel of $\varphi$ and is denoted by

\[ \ker \varphi \stackrel{\text{def}}{=} \varphi^{-1}(0) = \{ a \in A \mid \varphi(a) = 0 \}. \]

The kernel is a subgroup of $A$, because $\varphi(a_1) = \varphi(a_2) = 0$ implies

\[ \varphi(a_1 \pm a_2) = \varphi(a_1) \pm \varphi(a_2) = 0 \pm 0 = 0. \]


Proposition 2.1 For every homomorphism of abelian groups $\varphi : A \to B$ and every
element $b = \varphi(a) \in \operatorname{im} \varphi$,

\[ \varphi^{-1}(b) = a + \ker \varphi = \{ a + a' \mid a' \in \ker \varphi \}, \]

i.e., the fiber of $\varphi$ over $b$ is a shift of $\ker \varphi$ by any element $a \in \varphi^{-1}(b)$. In particular,
every nonempty fiber is in bijection with $\ker \varphi$, and $\varphi$ is injective if and only if
$\ker \varphi = 0$.


Proof The conditions $\varphi(a_1) = \varphi(a_2)$ and $\varphi(a_1 - a_2) = 0$ are equivalent. □


2.6.3 Group of Homomorphisms 


For any two abelian groups $A$, $B$, we write $\operatorname{Hom}(A, B)$ for the set of all homomorphisms
$A \to B$. The terms monomorphism, epimorphism, and isomorphism assume
by default that the map in question is a homomorphism of abelian groups. The set
$\operatorname{Hom}(A, B)$ is an abelian subgroup in the direct product $B^A$ of $A$ copies of the group
$B$. The inherited group operation on homomorphisms is the pointwise addition of
values:

\[ \varphi_1 + \varphi_2 : a \mapsto \varphi_1(a) + \varphi_2(a). \]

Exercise 2.15 Check that the sum of homomorphisms is a homomorphism as well.


The neutral element of the group $\operatorname{Hom}(A, B)$ is the zero homomorphism,¹³ which
takes every element of $A$ to the zero element of $B$.


2.6.4 Homomorphisms of Commutative Rings 


A map of rings $\varphi : A \to B$ is called a ring homomorphism if it respects both
operations, i.e., for all $a_1, a_2 \in A$, it satisfies the relations

\[ \varphi(a_1 + a_2) = \varphi(a_1) + \varphi(a_2) \quad \text{and} \quad \varphi(a_1 a_2) = \varphi(a_1)\varphi(a_2). \tag{2.28} \]

Since a ring homomorphism $\varphi : A \to B$ is a homomorphism of additive groups,
it possesses all the properties of homomorphisms of abelian groups. In particular,
$\varphi(0) = 0$, $\varphi(-a) = -\varphi(a)$, and all nonempty fibers of $\varphi$ are shifts of the kernel:

\[ \varphi^{-1}(\varphi(a)) = a + \ker \varphi = \{ a + a' \mid a' \in \ker \varphi \}, \]

where $\ker \varphi \stackrel{\text{def}}{=} \varphi^{-1}(0)$ as above. Therefore, a ring homomorphism $\varphi$ is injective if
and only if $\ker \varphi = \{0\}$. The kernel of a ring homomorphism has an additional
property related to the multiplication. For every $a \in \ker \varphi$, all its multiples $ba$ also
lie in $\ker \varphi$, because $\varphi(ba) = \varphi(b)\varphi(a) = 0$ for every $b \in A$. In particular, $\ker \varphi \subset A$
is a subring.

The image of a ring homomorphism $\varphi : A \to B$ is clearly a subring of $B$.
However, a ring homomorphism does not have to respect units, and $1_B$ may be
entirely outside $\operatorname{im}(\varphi)$.


Exercise 2.16 Check that the map $\mathbb{Z}/(2) \to \mathbb{Z}/(6)$ sending $[0] \mapsto [0]$, $[1] \mapsto [3]$ is
a ring homomorphism.


Nevertheless, every nonzero ring homomorphism to an integral domain always takes
1 to 1.


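Lemma 2.5 Let $\varphi : A \to B$ be a nonzero homomorphism of commutative rings
with unit. If $B$ has no zero divisors, then $\varphi(1) = 1$.


Proof Since $\varphi(1) = \varphi(1 \cdot 1) = \varphi(1) \cdot \varphi(1)$, the equality $\varphi(1)(1 - \varphi(1)) = 0$ holds
in the integral domain $B$. Hence, either $\varphi(1) = 1$ as required, or $\varphi(1) = 0$. In the
latter case, $\varphi(a) = \varphi(1 \cdot a) = \varphi(1) \cdot \varphi(a) = 0$ for all $a \in A$, contrary to the assumption
that $\varphi$ is nonzero. □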


13 Also called the trivial homomorphism.

2.6.5 Homomorphisms of Fields


If commutative rings $A$ and $B$ are fields, then every nonzero ring homomorphism
$\varphi : A \to B$ is a homomorphism of the multiplicative abelian groups $\varphi : A^* \to B^*$.
In particular, $\varphi(1) = 1$ and $\varphi(a/b) = \varphi(a)/\varphi(b)$ for all $a$ and all $b \ne 0$.


Proposition 2.2 Every nonzero homomorphism of a field to a ring is injective.


Proof If $\varphi(a) = 0$ for some $a \ne 0$, then $\varphi(b) = \varphi(b a^{-1} a) = \varphi(b a^{-1}) \varphi(a) = 0$
for all $b$. Thus, a nonzero $\varphi$ has zero kernel. □


2.7 Chinese Remainder Theorem 


Let any two of the numbers $n_1, n_2, \ldots, n_m \in \mathbb{Z}$ be coprime and $n = n_1 n_2 \cdots n_m$. The
map

\[ \varphi : \mathbb{Z}/(n) \to (\mathbb{Z}/(n_1)) \times (\mathbb{Z}/(n_2)) \times \cdots \times (\mathbb{Z}/(n_m)), \qquad
   [z]_n \mapsto \bigl( [z]_{n_1}, [z]_{n_2}, \ldots, [z]_{n_m} \bigr), \tag{2.29} \]

which takes the residue class $z \pmod{n}$ to the collection of residue classes
$z \pmod{n_i}$, is well defined, because for every $z_1 \equiv z_2 \pmod{n}$, the difference $z_1 - z_2$
is a multiple of $n = n_1 n_2 \cdots n_m$, and therefore $[z_1]_{n_i} = [z_2]_{n_i}$ for all $i$. It follows from
the computation

\[ \begin{aligned}
\varphi([z]_n + [w]_n) = \varphi([z + w]_n)
  &= \bigl( [z + w]_{n_1}, [z + w]_{n_2}, \ldots, [z + w]_{n_m} \bigr) \\
  &= \bigl( [z]_{n_1} + [w]_{n_1}, [z]_{n_2} + [w]_{n_2}, \ldots, [z]_{n_m} + [w]_{n_m} \bigr) \\
  &= \bigl( [z]_{n_1}, [z]_{n_2}, \ldots, [z]_{n_m} \bigr) + \bigl( [w]_{n_1}, [w]_{n_2}, \ldots, [w]_{n_m} \bigr) \\
  &= \varphi([z]_n) + \varphi([w]_n)
\end{aligned} \]

that $\varphi$ respects the addition. A similar calculation verifies that $\varphi$ respects the
multiplication as well. Therefore, the map (2.29) is a ring homomorphism. By
Lemma 2.3 on p. 26, every $z \in \mathbb{Z}$ such that $[z]_{n_i} = 0$ for all $i$ is divisible by
$n = n_1 \cdot n_2 \cdots n_m$. Hence, $\varphi$ is injective. Since the cardinalities of both sets $\mathbb{Z}/(n)$,
$\prod \mathbb{Z}/(n_i)$ are equal to $n = \prod n_i$, the homomorphism (2.29) is bijective. This fact is
known as the Chinese remainder theorem. In ordinary language, it says that for every
collection of remainders $r_1, r_2, \ldots, r_m$ under division by pairwise coprime numbers
$n_1, n_2, \ldots, n_m \in \mathbb{N}$, there exists a number $z \in \mathbb{Z}$ whose remainder on division by $n_i$
is $r_i$ simultaneously for all $i$, and two numbers $z_1$, $z_2$ sharing this property differ by
a multiple of $n = n_1 n_2 \cdots n_m$.

A practical computation of such a number $z$ can be done by means of the
Euclidean algorithm as follows. By Lemma 2.3 on p. 26, each $n_i$ is coprime to the
product $m_i = \prod_{\nu \ne i} n_\nu$ of all the other $n_\nu$'s. Therefore, the Euclidean algorithm
allows us to find for each $i$ some $x_i, y_i \in \mathbb{Z}$ such that $n_i x_i + m_i y_i = 1$. Then
$b_i = m_i y_i$ is congruent to 1 modulo $n_i$ and to 0 modulo all the other $n_\nu$'s. Hence,
$z = r_1 b_1 + r_2 b_2 + \cdots + r_m b_m$ solves the problem.
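The procedure just described can be sketched in a few lines of Python. This is an illustration under the assumption that the moduli are pairwise coprime positive integers; the function names `extended_gcd` and `crt` are chosen here and do not come from the text. The variables `m_i`, `y_i`, `b_i` mirror the notation of the paragraph above.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == GCD(a, b) for a, b >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def crt(remainders, moduli):
    """Return (z, n): the least nonnegative z with z = r_i (mod n_i), and n = prod n_i."""
    n = 1
    for n_i in moduli:
        n *= n_i
    z = 0
    for r_i, n_i in zip(remainders, moduli):
        m_i = n // n_i                      # product of all the other moduli
        g, _x_i, y_i = extended_gcd(n_i, m_i)
        if g != 1:
            raise ValueError("moduli must be pairwise coprime")
        b_i = m_i * y_i                     # b_i = 1 (mod n_i), b_i = 0 (mod n_j), j != i
        z += r_i * b_i
    return z % n, n

if __name__ == "__main__":
    print(crt([2, 7, 43], [57, 91, 179]))
```

Applied to the data of Example 2.7, remainders $2, 7, 43$ and moduli $57, 91, 179$, the sketch returns $z = 816641$ together with $n = 928473$.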


Example 2.7 To demonstrate the effectiveness of the above procedure, let us find
the smallest positive integer with remainders $r_1 = 2$, $r_2 = 7$, $r_3 = 43$ on division
by $n_1 = 57$, $n_2 = 91$, $n_3 = 179$ respectively.¹⁴ We first invert $91 \cdot 179$ modulo 57.
Since $91 \cdot 179 \equiv 34 \cdot 8 \equiv -13 \pmod{57}$, we can apply the Euclidean algorithm to
$E_0 = 57$, $E_1 = 13$. The output $22 \cdot 13 - 5 \cdot 57 = 1$ (check!) means that $-22 \cdot 91 \cdot 179 \equiv
1 \pmod{57}$. Thus, the number

\[ b_1 = -22 \cdot 91 \cdot 179 \quad (\equiv 22 \cdot 13 \pmod{57}) \]

produces the remainder triple $(1, 0, 0)$ on division by 57, 91, 179. Similarly,
we obtain the numbers

\[ \begin{aligned}
b_2 &= -33 \cdot 57 \cdot 179 \quad (\equiv -33 \cdot 11 \pmod{91}), \\
b_3 &= -45 \cdot 57 \cdot 91 \quad (\equiv 45 \cdot 4 \pmod{179}),
\end{aligned} \]

producing the remainder triples $(0, 1, 0)$ and $(0, 0, 1)$ on division by 57, 91, 179.
The required remainders $(2, 7, 43)$ are produced by

\[ \begin{aligned}
z = 2 b_1 + 7 b_2 + 43 b_3 &= -(2 \cdot 22 \cdot 91 \cdot 179 + 7 \cdot 33 \cdot 57 \cdot 179 + 43 \cdot 45 \cdot 57 \cdot 91) \\
  &= -(716\,716 + 2\,356\,893 + 10\,036\,845) = -13\,110\,454,
\end{aligned} \]

as well as by all numbers that differ from $z$ by a multiple of $n = 57 \cdot 91 \cdot 179 =
928\,473$. The smallest positive among them is equal to $z + 15n = 816\,641$.


2.8 Characteristic 


2.8.1 Prime Subfield 


For a commutative ring $K$ with unit there is a ring homomorphism $\varkappa :
\mathbb{Z} \to K$ defined by

\[ \varkappa(\pm n) = \pm(\underbrace{1 + 1 + \cdots + 1}_{n}) \quad \text{for all } n \in \mathbb{N}. \tag{2.30} \]

14 2, 7, 43, 57, 91, and 179 are the numbers of famous mathematical schools in Moscow.

Its image $\operatorname{im} \varkappa \subset K$ coincides with the intersection of all subrings in $K$ containing
the unit element of $K$. If $\varkappa$ is injective, we say that $K$ has zero characteristic and
write $\operatorname{char} K = 0$. Otherwise, we say that $K$ has positive characteristic and define
the characteristic $\operatorname{char} K$ to be the minimal $m \in \mathbb{N}$ such that $\underbrace{1 + 1 + \cdots + 1}_{m} = 0$.
The equality

\[ \underbrace{1 + 1 + \cdots + 1}_{mn} = (\underbrace{1 + 1 + \cdots + 1}_{m}) \cdot (\underbrace{1 + 1 + \cdots + 1}_{n}) \]

implies that the characteristic of an integral domain is either zero or a prime
number $p \in \mathbb{N}$. For an integral domain $K$ of positive characteristic $p$, the
homomorphism (2.30) takes all multiples of $p$ to zero and therefore can be factorized
into the composition $\varkappa_p \circ \pi_p$ of ring homomorphisms

\[ \pi_p : \mathbb{Z} \to \mathbb{F}_p, \ z \mapsto [z]_p \quad \text{and} \quad \varkappa_p : \mathbb{F}_p \to K, \ [z]_p \mapsto \varkappa(z), \tag{2.31} \]

the latter of which is injective by Proposition 2.2 on p. 34, because $\mathbb{F}_p$ is a field. Thus,
the smallest subring with unit in an integral domain $K$ of positive characteristic $p$ is
a field isomorphic to $\mathbb{F}_p = \mathbb{Z}/(p)$. It is called the prime subfield of $K$.

For a field $\mathbb{F}$, the prime subfield of $\mathbb{F}$ is defined to be the intersection of all
subfields in $\mathbb{F}$. This is the smallest subfield in $\mathbb{F}$ with respect to inclusion. Clearly,
it contains $\operatorname{im}(\varkappa)$. If $\operatorname{char}(\mathbb{F}) = p > 0$, then the previous arguments force the prime
subfield to be equal to $\operatorname{im} \varkappa = \operatorname{im}(\varkappa_p) \simeq \mathbb{Z}/(p)$. Thus, our second definition of the
prime subfield agrees with the previous one in this case.

For a field $\mathbb{F}$ of zero characteristic, the homomorphism $\varkappa : \mathbb{Z} \to \mathbb{F}$ is injective.
Since all nonzero elements in $\operatorname{im} \varkappa$ are invertible within $\mathbb{F}$, the assignment

\[ p/q \mapsto \varkappa(p)/\varkappa(q) \]

extends $\varkappa$ to a homomorphism of fields $\tilde{\varkappa} : \mathbb{Q} \hookrightarrow \mathbb{F}$, which is injective by
Proposition 2.2 on p. 34. We conclude that the prime subfield of $\mathbb{F}$ coincides with
$\operatorname{im} \tilde{\varkappa}$ and is isomorphic to $\mathbb{Q}$ in this case.


Exercise 2.17 Show that (a) every field endomorphism leaves every element in the 
prime field fixed; (b) there are no nonzero homomorphisms whatever between fields 
of different characteristics. 


In particular, the field $\mathbb{Q}$ is pointwise fixed by every automorphism of the fields $\mathbb{R}$
and $\mathbb{C}$.


2.8.2 Frobenius Endomorphism 


The same arguments as in Example 2.6 on p. 29 show that for a field $\mathbb{F}$ of
characteristic $p > 0$, the $p$-power exponentiation

\[ F_p : \mathbb{F} \to \mathbb{F}, \quad x \mapsto x^p, \tag{2.32} \]

is a ring homomorphism, because $(ab)^p = a^p b^p$ and

\[ (a + b)^p = a^p + b^p + \sum_{k=1}^{p-1} (\underbrace{1 + 1 + \cdots + 1}_{\binom{p}{k}}) \cdot a^k b^{p-k} = a^p + b^p. \]

The homomorphism (2.32) is called the Frobenius endomorphism or just the
Frobenius for short. The previous exercise, Exercise 2.17, says that the Frobenius
acts identically on the prime subfield $\mathbb{F}_p \subset \mathbb{F}$. This agrees with Fermat's little
theorem, Theorem 2.1 on p. 30.


Exercise 2.18 Show that a field $\mathbb{F}$ of characteristic $p > 0$ is isomorphic to $\mathbb{F}_p$ if
and only if the Frobenius endomorphism $F_p : \mathbb{F} \to \mathbb{F}$ coincides with the identity
map $\mathrm{Id}_{\mathbb{F}}$.


Problems for Independent Solution to Chap. 2 


Problem 2.1 Compute GCD(a, b) and express it as ax + by with x, y € Z for the fol- 
lowing pairs (a, b): (a)(17, 13), (b)(44 863, 70 499), (c)(8 385 403, 2 442 778). 

Problem 2.2 Find all integer solutions of the following equations: (a) $5x + 7y = 11$,
(b) $26x + 32y = 60$, (c) $1537x + 1387y = 1$, (d) $169x + 221y = 26$, (e) $28x +
30y + 31z = 365$.

Problem 2.3 Find the ninety-first positive integer that has remainders: (a) 2 and 7 
on division by 57 and 179, (b) 1, 2, 3 on division by 2, 3, 5, (c) 2, 4, 6, 8 on 
division by 5, 9, 11, 14. 

Problem 2.4 How many solutions does the equation $x^2 = 1$ have in the ring $\mathbb{Z}/(n)$
for even $n > 4$?


Problem 2.5 Prove that for each $m \in \mathbb{N}$, there exists $n \in \mathbb{N}$ such that the equation
$x^2 = 1$ has at least $m$ solutions in $\mathbb{Z}/(n)$.

Problem 2.6 How many solutions does the equation (a) x = 1, (b) x? = 49, have 
in the ring Z/(360) ? 

Problem 2.7 For each ring Z/(m) in the range 4 < m < 8, write the multiplication 
table and list all the squares, all the nilpotents, all the zero divisors, and all the 
invertible elements. For each invertible element, indicate its inverse. 

Problem 2.8 Show that: (a) $a^2 + b^2 \,\vdots\, 7 \Rightarrow a \,\vdots\, 7$ and $b \,\vdots\, 7$, (b) $a^3 + b^3 + c^3 \,\vdots\, 7 \Rightarrow abc \,\vdots\, 7$,
(c) $a^2 + b^2 + c^2 + d^2 + e^2 \,\vdots\, 9 \Rightarrow abcde \,\vdots\, 9$.

Problem 2.9 Does the equation $x^2 + y^2 + z^2 = 2xyz$ have any integer solutions
besides $(0, 0, 0)$?

Problem 2.10 Fix some nonzero $a \in \mathbb{Z}/(n)$ and write $\alpha : \mathbb{Z}/(n) \to \mathbb{Z}/(n)$,
$x \mapsto ax$, for the multiplication-by-$a$ map. Prove that the following conditions
are equivalent: (a) $a$ is invertible; (b) $a$ is not a zero divisor; (c) $\alpha$ is injective; (d)
$\alpha$ is surjective; (e) $\alpha$ is bijective.

Problem 2.11 (Euler’s Theorem on Residues) Let a € Z/(n) satisfy the 
conditions of the previous problem. Depict all elements of Z/(n) by some points 
on a sheet of paper and for each point x, draw an arrow from x to ax. Prove that 
in this picture: 


(a) movement along the arrows decomposes into disjoint nonintersecting cycles; 

(b) if a cycle goes through an invertible element, then all the elements it goes 
through are invertible; 

(c) all cycles passing through invertible elements are of the same length; 

(d) $a^{\varphi(n)} = 1$, where $\varphi(n)$ is the Euler function defined in Sect. 2.4.3 on p. 28.


Problem 2.12 Are $2222^{5555} + 5555^{2222}$ and $2^{70} + 3^{70}$ divisible by 7 and 13
respectively?


Problem 2.13 Find remainder on division of 201520!” by 11. 


Problem 2.14 For all $k \in \mathbb{N}$, find the remainders on division of $10^k$ by 2, 5, 4,
3, 9, 11, 7, 13. Formulate and prove algorithms¹⁵ to calculate the remainder on
division of a given decimal number $d$ by 2, 5, 4, 3, 9, 11, 7, 13 by looking at the
digits of $d$.¹⁶


Problem 2.15 (Primitive Invertible Residue Classes) Let $a \in (\mathbb{Z}/(n))^*$ be an
invertible residue class. The minimal $k \in \mathbb{N}$ such that $a^k = 1$ is called the order
of $a$. We say that $a$ generates the multiplicative group $(\mathbb{Z}/(n))^*$ if this group is
exhausted by the integer powers $a^k$. The generators of $(\mathbb{Z}/(n))^*$ are also called
primitive roots (or primitive residues) modulo $n$. Show that an invertible residue
$a$ is primitive if and only if the order of $a$ equals $\varphi(n)$.

(a) Prove the existence of a primitive residue modulo $p$ for every prime $p \in \mathbb{N}$.

(b) Let $a_1, a_2, \ldots, a_n \in (\mathbb{Z}/(n))^*$ have pairwise coprime orders $k_1, k_2, \ldots, k_n$. Find
the order of the product $a_1 \cdots a_n$.

(c) For two arbitrary invertible residues $a$, $b$ of arbitrary orders $k$, $m$, construct an
invertible residue of order $\mathrm{LCM}(k, m)$.

(d) Fix a prime $p > 2$ and a primitive residue $\varrho$ modulo $p$. Show that there exists
$\vartheta \in \mathbb{N}$ such that: (1) $(\varrho + p\vartheta)^{p-1} \equiv 1 \pmod{p}$; (2) $(\varrho + p\vartheta)^{p-1} \not\equiv 1 \pmod{p^2}$;
(3) the class $[\varrho + p\vartheta]$ is a primitive residue class modulo $p^k$ for all $k \in \mathbb{N}$.

(e) For $k \in \mathbb{N}$ and prime $p > 2$, prove the existence of a primitive residue
modulo $2p^k$.

(f) Is there a primitive residue modulo 21?


15 Try to make them as simple as possible.
16 For example, the remainder on division by 3 is equal to the remainder of the sum of the digits.

Problem 2.16 (Idempotents) An element $a$ of an arbitrary commutative ring with
unit is called idempotent if $a^2 = a$. The idempotents 0 and 1 are called trivial.
Show that (a) each nontrivial idempotent is a zero divisor; (b) $a$ is idempotent if
and only if $1 - a$ is too. (c) For which $n$ are there nontrivial idempotents in $\mathbb{Z}/(n)$?

Problem 2.17 Find all idempotents in the ring $\mathbb{Z}/(n)$ for (a) $n = 6$, (b) $n = 36$, (c)
$n = p_1 p_2 \cdots p_k$, (d) $n = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$, where $p_i$ are distinct prime numbers.


Problem 2.18 (Euler's Formula for $\varphi$) A function $f : \mathbb{N} \to \mathbb{C}$ is called a
multiplicative character if $f(mn) = f(m) f(n)$ for every coprime $m, n \in \mathbb{N}$.
Show that Euler's function¹⁷ $\varphi$ is a multiplicative character and for every $n =
p_1^{m_1} \cdots p_k^{m_k}$, where $p_i$ are distinct primes, prove Euler's formula

\[ \varphi(n) = n \cdot (1 - p_1^{-1}) \cdots (1 - p_k^{-1}). \]

Find all $n \in \mathbb{N}$ such that $\varphi(n) = 10$.


Problem 2.19 (Möbius Function) The Möbius function $\mu : \mathbb{N} \to \{-1, 0, 1\}$ gives
$\mu(1) = 1$ and $\mu(n) = 0$ for all $n$ divisible by the square of some prime number.
Otherwise, $\mu(n) = (-1)^s$, where $s$ is the number of positive prime divisors of $n$.
Show that $\mu$ is a multiplicative character and prove the equality

\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{for } n = 1, \\ 0 & \text{for } n > 1, \end{cases} \]

where the summation runs through all divisors of $n$ including $d = 1, n$.


Problem 2.20 (Möbius Inversion Formula) Given a function $g : \mathbb{N} \to \mathbb{C}$, define a
new function $\sigma_g : \mathbb{N} \to \mathbb{C}$ by $\sigma_g(n) = \sum_{d \mid n} g(d)$. Prove that $g$ is recovered from
$\sigma_g$ as $g(n) = \sum_{d \mid n} \sigma_g(d) \cdot \mu(n/d)$.

Problem 2.21 For each $m \in \mathbb{N}$, evaluate $\sum_{d \mid m} \varphi(d)$, where $\varphi$ is Euler's function.

Problem 2.22 (Wilson's Theorem) Solve the quadratic equation $x^2 = 1$ in the
field $\mathbb{F}_p$ and evaluate the product of all nonzero elements of $\mathbb{F}_p$. Deduce from this
computation that an integer $p > 2$ is prime if and only if $p \mid (p - 1)! + 1$.

Problem 2.23 Describe the sets of all values of polynomials (a) $x^p - x$, (b) $x^{p-1}$,
(c) $x^{(p-1)/2}$, for $x$ running through $\mathbb{F}_p$ and for $x$ running through the set of all squares
in $\mathbb{F}_p$.

Problem 2.24 How many nonzero squares are there in $\mathbb{F}_p$? Show that the equation
$x^2 + y^2 = -1$ is solvable in $x, y \in \mathbb{F}_p$ for every prime $p > 2$.

Problem 2.25 (Gauss's Lemma) Write all elements of $\mathbb{F}_p$ in order as

\[ -[(p-1)/2], \ \ldots, \ -[1], \ [0], \ [1], \ \ldots, \ [(p-1)/2]. \]

Prove that $a \in \mathbb{F}_p^*$ is a square if and only if an even number of “positive” elements
become “negative” under multiplication by $a$.


Problem 2.26 For what primes $p$ is the equation (a) $x^2 = -1$, (b) $x^2 = 2$, solvable
in $\mathbb{F}_p$?¹⁸


17 See Sect. 2.4.3 on p. 28.


18

Chapter 3
Polynomials and Simple Field Extensions 


In this chapter, K will denote an arbitrary commutative ring with unit and k an 
arbitrary field. 


3.1 Formal Power Series 


3.1.1 Rings of Formal Power Series 


Given an infinite sequence of elements $a_i \in K$, $i \ge 0$, an expression of the form

\[ f(x) = \sum_{\nu \ge 0} a_\nu x^\nu = a_0 + a_1 x + a_2 x^2 + \cdots \tag{3.1} \]

is called a formal power series in the variable $x$ with coefficients in $K$. Two power
series

\[ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots \quad \text{and} \quad g(x) = b_0 + b_1 x + b_2 x^2 + \cdots \tag{3.2} \]

are equal if $a_i = b_i$ for all $i$. Formal power series (3.2) are added and multiplied by
the usual rules for multiplying out and collecting like terms. Namely, the coefficients
of the series

\[ f(x) + g(x) = s_0 + s_1 x + s_2 x^2 + \cdots \quad \text{and} \quad f(x) g(x) = p_0 + p_1 x + p_2 x^2 + \cdots \]

are defined by¹

\[ s_m = a_m + b_m, \qquad p_m = \sum_{\alpha + \beta = m} a_\alpha b_\beta = a_0 b_m + a_1 b_{m-1} + \cdots + a_m b_0. \tag{3.3} \]


Exercise 3.1 Check that these operations satisfy the axioms of a commutative ring.


We write $K[[x]]$ for the ring of power series in $x$ with coefficients in $K$. The initial
coefficient $a_0$ in (3.1) is called the constant term of $f$. The leftmost summand with
nonzero coefficient in (3.1) is called the lowest term of $f$. Its power and coefficient
are called the lowest degree and lowest coefficient of $f$. If $K$ has no zero divisors,
then the lowest term in a product of power series is equal to the product of the lowest
terms of the factors. Hence, if $K$ is an integral domain, then $K[[x]]$ is an integral
domain too.

The ring $K[[x_1, x_2, \ldots, x_n]]$ of power series in $n$ variables is defined by induction:

\[ K[[x_1, x_2, \ldots, x_n]] \stackrel{\text{def}}{=} K[[x_1, x_2, \ldots, x_{n-1}]][[x_n]]. \]

It consists of infinite sums of the type $\sum a_{\nu_1 \ldots \nu_n} x_1^{\nu_1} x_2^{\nu_2} \cdots x_n^{\nu_n}$, where the $\nu_i$ run
independently through nonnegative integers and $a_{\nu_1 \ldots \nu_n} \in K$.


3.1.2 Algebraic Operations on Power Series 


An $n$-ary operation on $K[[x]]$ is a map of sets

\[ \underbrace{K[[x]] \times K[[x]] \times \cdots \times K[[x]]}_{n} \to K[[x]] \]

sending an $n$-tuple of series $f_1, f_2, \ldots, f_n \in K[[x]]$ to a new series $f$ depending on
$f_1, f_2, \ldots, f_n$. Such an operation is called algebraic if every coefficient of $f$ can be
evaluated by a finite number of additions, subtractions, multiplications, and well-defined
divisions applied to a finite number of coefficients of the $f_i$.

For example, the addition and multiplication defined in (3.3) are algebraic binary
operations, whereas the evaluation of $f$ at some point $x = \alpha \in K$ is not algebraic in
most cases, because it usually requires infinitely many additions and multiplications.
However, there is an important evaluation that always can be done, the evaluation at
zero, which takes $f$ to $f(0) = a_0$.


1 Formally speaking, these rules define addition and multiplication of sequences $(a_\nu)$, $(b_\nu)$ formed
by elements of $K$. The variable $x$ is used only to simplify the visual perception of these operations.

For a power series $g(x) = b_1 x + b_2 x^2 + \cdots$ without constant term, an important
unary algebraic operation is provided by the substitution $x \mapsto g(x)$, which takes
$f(x) \in K[[x]]$ to

\[ \begin{aligned}
f(g(x)) &= \sum_{k \ge 0} a_k (b_1 x + b_2 x^2 + \cdots)^k \\
  &= a_0 + a_1 (b_1 x + b_2 x^2 + \cdots) + a_2 (b_1 x + b_2 x^2 + \cdots)^2 + a_3 (b_1 x + b_2 x^2 + \cdots)^3 + \cdots \\
  &= a_0 + (a_1 b_1) \cdot x + (a_1 b_2 + a_2 b_1^2) \cdot x^2 + (a_1 b_3 + 2 a_2 b_1 b_2 + a_3 b_1^3) \cdot x^3 + \cdots,
\end{aligned} \]

whose coefficient at $x^m$ depends only on the coefficients of $f$ and $g$ at powers of $x$
not exceeding $m$.


Proposition 3.1 A power series $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots \in K[[x]]$ is invertible
in $K[[x]]$ if and only if its constant term $a_0 \in K$ is invertible in $K$. The inversion
map $f \mapsto f^{-1}$ is a unary algebraic operation on the multiplicative group $K[[x]]^*$ of
invertible power series.


Proof If there exists $f^{-1}(x) = b_0 + b_1 x + b_2 x^2 + \cdots$ such that $f(x) \cdot f^{-1}(x) = 1$, then
$a_0 b_0 = 1$, i.e., $a_0$ is invertible. Conversely, let $a_0 \in K$ be invertible. A comparison
of coefficients at the same power of $x$ on both sides of

\[ (a_0 + a_1 x + a_2 x^2 + \cdots) \cdot (b_0 + b_1 x + b_2 x^2 + \cdots) = 1 \]

leads to an infinite system of equations in the $b_i$:

\[ \begin{aligned}
a_0 b_0 &= 1, \\
a_0 b_1 + a_1 b_0 &= 0, \\
a_0 b_2 + a_1 b_1 + a_2 b_0 &= 0, \\
&\;\;\vdots
\end{aligned} \tag{3.4} \]

from which we obtain $b_0 = a_0^{-1}$ and $b_k = -a_0^{-1} (a_1 b_{k-1} + a_2 b_{k-2} + \cdots + a_k b_0)$ for
$k \ge 1$. □
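The recurrence obtained from the system (3.4) translates directly into a procedure computing any finite number of coefficients of $f^{-1}$. Below is a minimal Python sketch, assuming coefficients in $\mathbb{Q}$ (represented by `fractions.Fraction`) and a series given by finitely many leading coefficients, all further ones being zero; the name `series_inverse` is illustrative.

```python
from fractions import Fraction

def series_inverse(a, terms):
    """First `terms` coefficients b_0, b_1, ... of the inverse of a_0 + a_1 x + ...

    `a` lists a_0, a_1, ..., a_d; coefficients beyond a_d are taken to be zero.
    Uses b_0 = 1/a_0 and b_k = -(a_1 b_{k-1} + ... + a_k b_0)/a_0 for k >= 1.
    """
    a = [Fraction(c) for c in a]
    if not a or a[0] == 0:
        raise ValueError("the constant term must be invertible")
    inv_a0 = 1 / a[0]
    b = [inv_a0]
    for k in range(1, terms):
        s = sum(a[i] * b[k - i] for i in range(1, min(k, len(a) - 1) + 1))
        b.append(-inv_a0 * s)
    return b[:terms]

if __name__ == "__main__":
    # 1/(1 - x - x^2): the coefficients are the Fibonacci numbers.
    print([int(c) for c in series_inverse([1, -1, -1], 8)])
```

Each coefficient $b_k$ is obtained from $a_0, \ldots, a_k$ by finitely many ring operations and one division by $a_0$, which is exactly what makes inversion an algebraic operation in the sense of this section.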


Exercise 3.2 In $\mathbb{Q}[[x]]$ compute $(1 - x)^{-1}$, $(1 - x^2)^{-1}$, and $(1 - x)^{-2}$.


3.1.3 Polynomials 


A power series that has only a finite number of nonzero coefficients is called
a polynomial. The set of polynomials in the variables $x_1, x_2, \ldots, x_n$ forms a ring,
denoted by

\[ K[x_1, x_2, \ldots, x_n] \subset K[[x_1, x_2, \ldots, x_n]]. \]

A polynomial in one variable $x$ is a finite sum $f(x) = a_0 + a_1 x + \cdots + a_n x^n$. The
rightmost nonzero term $a_n x^n$ and its coefficient $a_n \ne 0$ are called the leading term
and coefficient in $f$. A polynomial with leading coefficient 1 is called reduced or
monic. The leading exponent $n$ is called the degree of $f$ and is denoted by $\deg f$.
Polynomials of degree zero are exactly the nonzero constants. It is convenient to
put $\deg 0 \stackrel{\text{def}}{=} -\infty$. If $K$ has no zero divisors, then the leading term of a product $f_1 f_2$
equals the product of the leading terms of $f_1$, $f_2$. Hence, $\deg(f_1 f_2) = \deg f_1 + \deg f_2$
over such a $K$. In particular, $K[x]$ is an integral domain, and the invertible elements
in $K[x]$ are exhausted by the invertible constants.


Exercise 3.3 Check that $y - x$ divides $y^n - x^n$ in $\mathbb{Z}[x, y]$ and compute the quotient.


3.1.4 Differential Calculus 


Substitution of $x + t$ for $x$ in a power series

\[ f(x) = a_0 + a_1 x + a_2 x^2 + \cdots \]

gives a power series in the two variables $x$, $t$:

\[ f(x + t) = a_0 + a_1 (x + t) + a_2 (x + t)^2 + \cdots. \]

Let us expand and collect terms of the same power in $t$:

\[ f(x + t) = f_0(x) + f_1(x) \cdot t + f_2(x) \cdot t^2 + f_3(x) \cdot t^3 + \cdots = \sum_{m \ge 0} f_m(x) \cdot t^m. \tag{3.5} \]

This is a power series in $t$ with coefficients $f_m \in K[[x]]$, which are uniquely
determined by $f$ and depend algebraically on $f$ in the sense of Sect. 3.1.2.

Exercise 3.4 Check that $f_0(x) = f(x)$.

The series $f_1(x)$ in (3.5) is called the derivative of $f$ and is denoted by $f'$ or by $\frac{d}{dx} f$.
It is uniquely determined by the condition

\[ f(x + t) = f(x) + f'(x) \cdot t + (\text{terms divisible by } t^2) \quad \text{in } K[[x, t]]. \tag{3.6} \]

By Exercise 3.3, the difference $f(x + t) - f(x)$ is divisible by $t$ in $K[[x, t]]$. The constant
term of the quotient

\[ \begin{aligned}
\frac{f(x + t) - f(x)}{t}
  &= a_1 \cdot \frac{(x + t) - x}{t} + a_2 \cdot \frac{(x + t)^2 - x^2}{t} + a_3 \cdot \frac{(x + t)^3 - x^3}{t} + \cdots \\
  &= \sum_{k \ge 1} a_k \cdot \bigl( (x + t)^{k-1} + (x + t)^{k-2} x + (x + t)^{k-3} x^2 + \cdots + x^{k-1} \bigr)
\end{aligned} \]

can be obtained by evaluation at $t = 0$. This leads to the well-known formula

\[ f'(x) = \sum_{k \ge 1} k \, a_k x^{k-1} = a_1 + 2 a_2 x + 3 a_3 x^2 + \cdots, \tag{3.7} \]

where each multiplier $k$ in front of $a_k x^{k-1}$ means the sum of $k$ unit elements in $K$.
Note that everything just said makes sense over any commutative ring $K$ with unit.

Example 3.1 (Series with Zero Derivative) Now assume that $K$ is an integral
domain. If the characteristic² $\operatorname{char} K$ is equal to 0, then formula (3.7) implies that
$f' = 0$ if and only if $f = \mathrm{const}$. However, if $\operatorname{char} K = p > 0$, the derivation
kills exactly all the monomials $x^m$ whose degree $m$ is divisible by $p$. Thus, for power
series with coefficients in an integral domain $K$ of characteristic $p > 0$, the condition
$f'(x) = 0$ means that $f(x) = g(x^p)$ for some $g \in K[[x]]$. Moreover, the same is true
in the subring of polynomials $K[x] \subset K[[x]]$.


Lemma 3.1 For any prime $p \in \mathbb{N}$, the polynomials with zero derivative in $\mathbb{F}_p[x]$
are exhausted by the $p$th powers $g^p$, $g \in \mathbb{F}_p[x]$.


Proof Since the Frobenius endomorphism³ $F_p : \mathbb{F}_p[x] \to \mathbb{F}_p[x]$, $h \mapsto h^p$, acts
identically on the coefficients, for every $g(x) = b_0 x^m + b_1 x^{m-1} + \cdots + b_{m-1} x + b_m \in
\mathbb{F}_p[x]$, we have

\[ \begin{aligned}
g(x^p) &= b_0 x^{pm} + b_1 x^{p(m-1)} + \cdots + b_{m-1} x^p + b_m \\
  &= b_0^p x^{pm} + b_1^p x^{p(m-1)} + \cdots + b_{m-1}^p x^p + b_m^p \\
  &= (b_0 x^m + b_1 x^{m-1} + \cdots + b_{m-1} x + b_m)^p = g^p(x).
\end{aligned} \]
□


Proposition 3.2 (Differentiation Rules) For a commutative ring $K$ with unit, the
following equalities hold:

\[ \begin{aligned}
(\alpha f)' &= \alpha \cdot f' && \text{for every } \alpha \in K \text{ and } f \in K[[x]], && (3.8) \\
(f + g)' &= f' + g' && \text{for every } f, g \in K[[x]], && (3.9) \\
(fg)' &= f' \cdot g + f \cdot g' && \text{for every } f, g \in K[[x]], && (3.10) \\
(f(g(x)))' &= g'(x) \cdot f'(g(x)) && \text{for all } f \text{ and all } g \text{ with no constant term}, && (3.11) \\
(f^{-1})' &= -f'/f^2 && \text{for every invertible } f \in K[[x]]. && (3.12)
\end{aligned} \]

2 See Sect. 2.8 on p. 35.
3 See Sect. 2.8.2 on p. 36.

Proof The first two equalities follow directly from (3.7). To verify the Leibniz
rule (3.10), write

\[ \begin{aligned}
f(x + t) &= f(x) + t \cdot f'(x) + (\text{terms divisible by } t^2), \\
g(x + t) &= g(x) + t \cdot g'(x) + (\text{terms divisible by } t^2).
\end{aligned} \]

Then $f(x + t) g(x + t) = f(x) g(x) + t \cdot (f'(x) g(x) + f(x) g'(x)) + (\text{terms divisible by } t^2)$.
By (3.6), this means that $(fg)' = f' \cdot g + f \cdot g'$. The equality (3.11) is proved in a
similar way. Let $\tau(x, t) = g(x + t) - g(x) = t \cdot g'(x) + (\text{terms divisible by } t^2)$. Then

\[ \begin{aligned}
f(g(x + t)) &= f(g(x) + \tau(x, t)) \\
  &= f(g(x)) + \tau(x, t) \cdot f'(g(x)) + (\text{terms divisible by } \tau(x, t)^2) \\
  &= f(g(x)) + t \cdot g'(x) \cdot f'(g(x)) + (\text{terms divisible by } t^2).
\end{aligned} \]

Therefore $(f(g(x)))' = g'(x) \cdot f'(g(x))$ by (3.6). Equality (3.12) is verified by the
differentiation of both sides of the equality $f \cdot f^{-1} = 1$. This gives $f' \cdot f^{-1} + f \cdot (f^{-1})' =
0$ and forces $(f^{-1})' = -f'/f^2$. □
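Exercise 3.5 For $m \in \mathbb{N}$, show that $f_m = \frac{1}{m!} \frac{d^m}{dx^m} f(x)$ in formula (3.5) on p. 44. Here
we write $\frac{d^m}{dx^m} = \left( \frac{d}{dx} \right)^m$ for the $m$-fold derivation map $\frac{d}{dx} : f \mapsto f'$.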


3.2 Polynomial Rings 


3.2.1 Division 


Perhaps you learned how to carry out polynomial long division in school. It is
similar to the long division of integers and is applicable in a number of mathematical
situations.


Proposition 3.3 (Division with Remainder) Let $K$ be a commutative ring with
unit and $u \in K[x]$ a polynomial with invertible leading coefficient. Then for a given
polynomial $f \in K[x]$, there exist polynomials $q, r \in K[x]$ such that $f = u \cdot q + r$,
and either $\deg(r) < \deg(u)$ or $r = 0$. If $K$ has no zero divisors, then $q$ and $r$ are
uniquely determined by $f$ and $u$.

Proof Let $u = b_0 x^k + b_1 x^{k-1} + \cdots + b_{k-1} x + b_k$. If $\deg f < k$, we can take $q = 0$,
$r = f$. For $f = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$, where $n \ge k$, assume inductively
that $q$ and $r$ exist for all polynomials of degree less than $n$. Since the difference
$f - a_0 b_0^{-1} x^{n-k} u$ has degree less than $n$, it can be written as $qu + r$, where either $r = 0$
or $\deg r < k$. Then $f = (q + a_0 b_0^{-1} x^{n-k}) \cdot u + r$ also has such a form. Now let $K$ be
an integral domain and let $p$, $s$ be another pair of polynomials such that $\deg(s) < k$
and $up + s = f = uq + r$. Then $u(q - p) = s - r$. If $p - q \ne 0$, then the left-hand
side has degree at least $k$, whereas the degree of the right-hand side is strictly less
than $k$. Thus, $p - q = 0$, and therefore $r - s = 0$. □
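The inductive step of the proof is the familiar long-division loop: subtract a suitable multiple of $u$ to kill the leading term of $f$, and repeat. The following Python sketch illustrates this for coefficients in the field $\mathbb{Q}$ (an assumption made here for concreteness; the proposition itself only needs the leading coefficient of $u$ to be invertible). Polynomials are lists of coefficients with the leading one first, as in the proof, and the name `poly_divmod` is hypothetical.

```python
from fractions import Fraction

def poly_divmod(f, u):
    """Divide f by u over Q; return (q, r) with f = u*q + r and deg r < deg u.

    Polynomials are coefficient lists, leading coefficient first;
    the zero polynomial is the empty list.
    """
    f = [Fraction(c) for c in f]
    u = [Fraction(c) for c in u]
    if not u or u[0] == 0:
        raise ValueError("u must have a nonzero leading coefficient")
    k = len(u) - 1                 # degree of u
    r, q = f[:], []
    while len(r) > k:
        c = r[0] / u[0]            # the coefficient a_0 * b_0^{-1} from the proof
        q.append(c)
        for i in range(k + 1):     # subtract c * x^(n-k) * u from r
            r[i] -= c * u[i]
        r.pop(0)                   # the leading coefficient is now zero
    while r and r[0] == 0:         # strip leading zeros of the remainder
        r.pop(0)
    return q, r

if __name__ == "__main__":
    print(poly_divmod([1, 0, 1], [1, 1]))   # x^2 + 1 divided by x + 1
```

For example, dividing $x^2 + 1$ by $x + 1$ gives the quotient $x - 1$ and the remainder $2$.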


Definition 3.1 The polynomials $q$ and $r$ from Proposition 3.3 are called the quotient
and remainder on division of $f$ by $u$ in $K[x]$.


Example 3.2 (Evaluation of a Polynomial) For a polynomial $f(x) = a_n x^n +
a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in K[x]$, the remainder on division of $f$ by a linear
binomial $x - \alpha$ has degree at most zero, i.e., is a constant. Substitution of $x = \alpha$ in
the equality $f(x) = (x - \alpha) \cdot q(x) + r$ leads to $r = f(\alpha)$. Thus, the value of $f$ at $\alpha \in K$
is equal to the remainder on division of $f$ by $x - \alpha$. Note that calculation of $f(\alpha)$ by
the long division algorithm is much faster than simply evaluating all powers $\alpha^m$ and
then adding them together.


Exercise 3.6 (Horner's Method) Check that

\[ f(\alpha) = a_0 + \alpha \cdot \Bigl( a_1 + \alpha \cdot \bigl( a_2 + \cdots + \alpha \cdot ( a_{n-2} + \alpha \cdot ( a_{n-1} + \alpha \cdot a_n ) ) \cdots \bigr) \Bigr). \]
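Read from the innermost bracket outward, the nested formula is a loop with one multiplication and one addition per coefficient. Here is a small Python sketch (the function names are illustrative; coefficients are listed from $a_n$ down to $a_0$). The second function keeps the intermediate values, which are exactly the coefficients of the quotient on division by $x - \alpha$.

```python
def horner(coeffs, alpha):
    """Value at alpha of the polynomial with coefficients [a_n, ..., a_1, a_0]."""
    value = 0
    for a in coeffs:
        value = value * alpha + a
    return value

def divide_by_linear(coeffs, alpha):
    """Return (q, r) with f(x) = (x - alpha) * q(x) + r, where r = f(alpha).

    The running values of Horner's scheme are the coefficients of q,
    leading coefficient first; the last running value is the remainder.
    """
    if not coeffs:
        return [], 0
    partial, value = [], 0
    for a in coeffs:
        value = value * alpha + a
        partial.append(value)
    return partial[:-1], partial[-1]

if __name__ == "__main__":
    print(horner([1, 0, 0, -1], 2))            # x^3 - 1 at x = 2
    print(divide_by_linear([1, 0, 1], -1))     # x^2 + 1 divided by x + 1
```

The sketch works over any ring whose elements support `+` and `*` in Python, for instance integers or `fractions.Fraction` values.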


Corollary 3.1 For a field $k$ and two polynomials $f, g \in k[x]$ with $g \ne 0$, there exists a unique
pair of polynomials $q, r \in k[x]$ such that $f = g \cdot q + r$ and either $\deg(r) < \deg(g)$
or $r = 0$. □


Proposition 3.4 For a field k and collection of polynomials f,,f2,....fn © kx], 
there exists a unique monic polynomial d € k|x] dividing all f; and divisible by 
every common divisor of all the f;. Moreover, this polynomial d can be written as 


filly + faho +---+faltn where h; € k[x]. (3.13) 


A polynomial g € k|x] can be represented in the form (3.13) if and only if d | g. 
Proof Existence is established by the same arguments as in Sect. 2.4.3 on p. 28.
Write

$$(f_1, f_2, \ldots, f_n) \stackrel{\text{def}}{=} \{ f_1 h_1 + f_2 h_2 + \cdots + f_n h_n \mid h_i \in k[x] \} \tag{3.14}$$

for the set of all polynomials representable in the form (3.13). This is a subring of
$k[x]$ such that $g \in (f_1, f_2, \ldots, f_n)$ forces $hg \in (f_1, f_2, \ldots, f_n)$ for all $h \in k[x]$. Note
that $f_i \in (f_1, f_2, \ldots, f_n)$ for all $i$, every element of $(f_1, f_2, \ldots, f_n)$ is divisible by every
common divisor of all the $f_i$, and every nonzero polynomial in $(f_1, f_2, \ldots, f_n)$ can be made
monic by multiplying by the constant inverse of its leading coefficient. Write $d$ for
any monic polynomial of lowest degree in $(f_1, f_2, \ldots, f_n)$. It is enough to check that
$d$ divides every $g \in (f_1, f_2, \ldots, f_n)$. The remainder $r = g - qd \in (f_1, f_2, \ldots, f_n)$ on
such a division either has $\deg r < \deg d$ or vanishes. The first is impossible by the
choice of $d$.

To prove uniqueness, note that if two polynomials $d_1$, $d_2$ satisfy $d_1 \mid d_2$ and
$d_2 \mid d_1$, then $\deg d_1 = \deg d_2$ and $d_1 / d_2 = \mathrm{const}$. If both polynomials are monic,
the constant has to be 1, whence the choice of $d$ is unique. □


Definition 3.2 The polynomial $d$ from Proposition 3.4 is called the greatest
common divisor of the polynomials $f_i$ and is denoted by $\mathrm{GCD}(f_1, f_2, \ldots, f_n)$.


3.2.2 Coprime Polynomials 


By Proposition 3.4, the polynomials $f_1, f_2, \ldots, f_m \in k[x]$ are coprime⁴ if and only if
they have no common divisors except for constants. This is similar to what we had
in the ring of integers, and for this reason, $k[x]$ and $\mathbb{Z}$ share a number of very nice
divisibility properties.


Definition 3.3 (Reducibility) Let $K$ be an arbitrary commutative ring with unit. An
element $f \in K$ is called reducible if $f = gh$ for some noninvertible $g, h \in K$. Note
that all reducible elements are noninvertible. Noninvertible nonreducible elements
are called irreducible.


For example, a polynomial $f \in k[x]$ is reducible in $k[x]$ if and only if $f = gh$ for
some $g, h \in k[x]$ such that $\deg g < \deg f$ and $\deg h < \deg f$.


Exercise 3.7 Let $k$ be a field. Show that $g \in k[x]$ is irreducible if and only if
$\mathrm{GCD}(g, f) = 1$ for all nonzero $f \in k[x]$ such that $\deg f < \deg g$. Use Lemma 2.3 on p. 26 to
prove the factorization theorem for $k[x]$: each nonconstant $f \in k[x]$ is equal to a finite product of
irreducible polynomials, and two such factorizations $p_1 p_2 \cdots p_k = f = q_1 q_2 \cdots q_m$
have $k = m$ and (after appropriate renumbering of factors) $p_i = \lambda_i \cdot q_i$ for all $i$ and
some nonzero $\lambda_i \in k$.


3.2.3 Euclidean Algorithm 


We may translate the Euclidean algorithm from Sect. 2.2.3 on p. 25 word for word
into the context of $k[x]$. Given two polynomials $f_1, f_2 \in k[x]$ with $\deg(f_1) \geq \deg(f_2)$,
write $E_0 = f_1$, $E_1 = f_2$, and for $k \geq 2$, put

$$E_k = \text{remainder on division of } E_{k-2} \text{ by } E_{k-1}.$$

The degrees of the $E_k$ are strictly decreasing until the next $E_r$ divides $E_{r-1}$, and we get
$E_{r+1} = 0$. The last nonzero polynomial $E_r$ in the sequence is equal to $\mathrm{GCD}(f_1, f_2)$
up to a nonzero constant factor. During the computation, one can write each $E_k$ as
$g_k \cdot E_0 + h_k \cdot E_1$ for some $g_k, h_k \in k[x]$. Then the output will be represented as
$E_r = g_r f_1 + h_r f_2$. The next step leads to $E_{r+1} = 0 = g_{r+1} f_1 + h_{r+1} f_2$, where
$g_{r+1}$ and $-h_{r+1}$ are coprime associated factors such that
$\mathrm{LCM}(f_1, f_2) = g_{r+1} f_1 = -h_{r+1} f_2$ (again up to a nonzero constant factor).

⁴ By definition, this means that $1 = h_1 f_1 + h_2 f_2 + \cdots + h_m f_m$ for some $h_i \in k[x]$ (see Sect. 2.3 on
p. 26).


Exercise 3.8 Prove this. 
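
The Euclidean algorithm with the bookkeeping of the coefficients $g_k$, $h_k$ can be sketched in Python over $\mathbb{Q}$ as follows. This is only an illustrative sketch under our own conventions: a polynomial is a list of coefficients with the lowest degree first, the empty list is the zero polynomial, and the helper names `trim`, `add`, `mul`, `divide`, `extended_euclid` are hypothetical. The sample input is the pair of polynomials $f_1$, $f_2$ of Example 3.3.

```python
from fractions import Fraction

def trim(p):
    """Convert to Fractions and drop zero leading coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim(a + b for a, b in zip(p, q))

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divide(f, u):
    """Return (q, r) with f = u*q + r and r = 0 or deg r < deg u; u must be nonzero."""
    q, r, u = [], trim(f), trim(u)
    while len(r) >= len(u):
        # cancel the leading term of r by a monomial multiple of u
        term = [0] * (len(r) - len(u)) + [r[-1] / u[-1]]
        q = add(q, term)
        r = add(r, mul([-1], mul(term, u)))
    return q, r

def extended_euclid(f1, f2):
    """Return (d, g, h) with d = g*f1 + h*f2 the monic GCD; f1, f2 not both zero."""
    e0, e1 = trim(f1), trim(f2)
    g0, h0, g1, h1 = [Fraction(1)], [], [], [Fraction(1)]
    while e1:
        q, r = divide(e0, e1)
        e0, e1 = e1, r
        g0, g1 = g1, add(g0, mul([-1], mul(q, g1)))
        h0, h1 = h1, add(h0, mul([-1], mul(q, h1)))
    c = [1 / e0[-1]]          # rescale the last nonzero remainder to be monic
    return mul(c, e0), mul(c, g0), mul(c, h0)

f1 = [4, 3, 5, 3, 1, 4, 3, 1]   # x^7 + 3x^6 + 4x^5 + x^4 + 3x^3 + 5x^2 + 3x + 4
f2 = [4, 7, 12, 11, 5, 1]       # x^5 + 5x^4 + 11x^3 + 12x^2 + 7x + 4
d, g, h = extended_euclid(f1, f2)
print(d)                                   # coefficients of x^2 + 3x + 4
print(add(mul(g, f1), mul(h, f2)) == d)    # the representation d = g*f1 + h*f2
```

Exact rational arithmetic via `Fraction` avoids the rounding problems that floating-point coefficients would introduce into the degree comparisons.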


Example 3.3 Let us carry out the Euclidean algorithm for $f_1(x) = x^7 + 3x^6 + 4x^5 +
x^4 + 3x^3 + 5x^2 + 3x + 4$ and $f_2(x) = x^5 + 5x^4 + 11x^3 + 12x^2 + 7x + 4$:

$$\begin{aligned}
E_0 &= x^7 + 3x^6 + 4x^5 + x^4 + 3x^3 + 5x^2 + 3x + 4,\\
E_1 &= x^5 + 5x^4 + 11x^3 + 12x^2 + 7x + 4,\\
E_2 &= -4x^4 - 13x^3 - 21x^2 - 10x - 8 = E_0 - (x^2 - 2x + 3) E_1;
\end{aligned}$$

it is more convenient first to divide $16 E_1$ by $E_2$ and then divide the result by 16:

$$E_3 = \tfrac{1}{16}\,(x^3 + 5x^2 + 10x + 8) = \tfrac{1}{16}\,\bigl(16 E_1 + (4x + 7) E_2\bigr)
= \frac{4x + 7}{16}\, E_0 - \frac{4x^3 - x^2 - 2x + 5}{16}\, E_1;$$

the next step leads to the greatest common divisor:

$$E_4 = -16\,(x^2 + 3x + 4) = E_2 + 16\,(4x - 7)\, E_3 = 16\,(x^2 - 3)\, E_0 - 16\,(x^4 - 2x^3 + 2x - 2)\, E_1,$$

because

$$E_5 = E_3 + (x + 2) \cdot E_4 / 256 = \tfrac{1}{16}\,(x^3 + 2x^2 + x + 1) \cdot E_0 - \tfrac{1}{16}\,(x^5 + x^2 + 1) \cdot E_1 = 0.$$

Thus,

$$\begin{aligned}
\mathrm{GCD}(f_1, f_2) &= x^2 + 3x + 4 = -(x^2 - 3)\, f_1(x) + (x^4 - 2x^3 + 2x - 2)\, f_2(x),\\
\mathrm{LCM}(f_1, f_2) &= (x^3 + 2x^2 + x + 1)\, f_1(x) = (x^5 + x^2 + 1)\, f_2(x).
\end{aligned}$$


3.3 Roots of Polynomials


Definition 3.4 An element $a \in K$ is called a root of the polynomial $f \in K[x]$ if
$f(a) = 0$. By Example 3.2 on p. 47, $f(a) = 0$ occurs if and only if $(x - a)$ divides
$f(x)$ in $K[x]$.


Exercise 3.9 For a field $k$ and polynomial $f \in k[x]$ of degree 2 or 3, show that $f$ is
irreducible in $k[x]$ if and only if $f$ has no roots in $k$.


Proposition 3.5 Let $K$ be an integral domain. If a polynomial $f \in K[x]$ has $s$
distinct roots $\alpha_1, \alpha_2, \ldots, \alpha_s \in K$, then $f$ is divisible by $\prod_i (x - \alpha_i)$ in $K[x]$. In
particular, either $\deg(f) \geq s$ or $f = 0$ in this case.


Proof Write $f$ as $f(x) = (x - \alpha_1) \cdot g(x)$ and substitute $x = \alpha_2, \alpha_3, \ldots, \alpha_s$ in this
equality. Since $(\alpha_i - \alpha_1) \neq 0$ for all $i \neq 1$ and $K$ has no zero divisors, all the $\alpha_i$ for
$i \geq 2$ are roots of $g(x)$. Then we proceed by induction. □


Corollary 3.2 Every nonzero polynomial $f$ with coefficients in an integral domain
$K$ has at most $\deg(f)$ distinct roots in $K$. □


Corollary 3.3 Let $K$ be an integral domain and suppose $f, g \in K[x]$ are each of
degree at most $n$. If $f(a_i) = g(a_i)$ for more than $n$ distinct $a_i \in K$, then $f = g$ in
$K[x]$.


Proof Since $f - g$ has more than $n$ roots but $\deg(f - g) \leq n$, it must be the zero
polynomial. □


Exercise 3.10 (Lagrange's Interpolating Polynomial) For a field $k$, every
collection of $n + 1$ distinct points $a_0, a_1, \ldots, a_n \in k$, and arbitrary sequence of
values $b_0, b_1, \ldots, b_n \in k$, construct a polynomial $f(x) \in k[x]$ such that $\deg f \leq n$
and $f(a_i) = b_i$ for all $i$. Prove that such a polynomial is unique.


3.3.1 Common Roots 


Let $k$ be a field and $K \supset k$ a commutative ring containing $k$ as a subring.
Polynomials $f_1, f_2, \ldots, f_m \in k[x]$ have a common root $\alpha \in K$ if and only if $x - \alpha$ is
a common divisor of all the $f_i$ in $K[x]$. If $h = \mathrm{GCD}(f_1, f_2, \ldots, f_m) \in k[x]$ has positive
degree, then every common root of all the $f_i$ is a root of $h$. Since $\deg h \leq \min \deg f_i$,
finding the common roots of a few polynomials often turns out to be easier than
doing so for each polynomial individually. In particular, if $\mathrm{GCD}(f_1, f_2, \ldots, f_m) = 1$
within $k[x]$, then $f_1, f_2, \ldots, f_m$ have no common roots even in $K$, because the equality
$f_1 h_1 + f_2 h_2 + \cdots + f_m h_m = 1$, which holds for some $h_1, h_2, \ldots, h_m \in k[x]$, prohibits
the simultaneous vanishing of all the $f_i(\alpha)$ at any point $\alpha \in K$.


3.3.2 Multiple Roots


Let $k$ be a field as above and $f \in k[x]$. We say that $\alpha \in k$ is a root of multiplicity
$m$ for $f$ if $f(x) = (x - \alpha)^m \cdot g(x)$ in $k[x]$ and $g(\alpha) \neq 0$. Roots of multiplicity 1 are
called simple. Roots of multiplicity $m \geq 2$ are called multiple or $m$-tuple.
Proposition 3.6 Let $k$ be a field, $f \in k[x]$, and $\alpha \in k$ a root of $f$. Then $\alpha$ is a
multiple root if and only if $f'(\alpha) = 0$.

Proof If $\alpha$ is a multiple root, then $f(x) = (x - \alpha)^2 g(x)$. Differentiation of both sides
leads to $f'(x) = (x - \alpha)\bigl(2g(x) + (x - \alpha) g'(x)\bigr)$ and $f'(\alpha) = 0$. If $\alpha$ is simple,
then $f(x) = (x - \alpha) g(x)$, where $g(\alpha) \neq 0$. Now $f'(x) = (x - \alpha) g'(x) + g(x)$ and
$f'(\alpha) = g(\alpha) \neq 0$. □

Proposition 3.7 Let $\operatorname{char} k = 0$. A root $\alpha \in k$ of a polynomial $f \in k[x]$ has
multiplicity $m \geq 2$ if and only if $\alpha$ is an $(m - 1)$-tuple root of $f'$. As a consequence,
$\alpha$ has multiplicity $m$ if and only if

$$f(\alpha) = \frac{d}{dx} f(\alpha) = \cdots = \frac{d^{m-1}}{dx^{m-1}} f(\alpha) = 0 \quad \text{but} \quad \frac{d^m}{dx^m} f(\alpha) \neq 0.$$

Proof If $f(x) = (x - \alpha)^m g(x)$, then $f'(x) = (x - \alpha)^{m-1}\bigl(m g(x) + (x - \alpha) g'(x)\bigr)$. For
$g(\alpha) \neq 0$ and since $m \neq 0$ in $k$, the latter factor is not zero for $x = \alpha$. This proves the
first statement. The second follows by induction. □
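
The criterion of Proposition 3.7 translates directly into a computation: differentiate until the value at $\alpha$ stops vanishing. Below is a minimal Python sketch over the integers or rationals (characteristic zero). The names `derivative`, `evaluate`, `multiplicity` and the coefficient order $[a_0, a_1, \ldots, a_n]$ are assumptions made for illustration, and the input polynomial is assumed to be nonzero.

```python
def derivative(coeffs):
    """Coefficients [a_0, ..., a_n] of f -> coefficients of f'."""
    return [i * a for i, a in enumerate(coeffs)][1:]

def evaluate(coeffs, alpha):
    """Value of f at alpha, computed by the nested (Horner) scheme."""
    acc = 0
    for a in reversed(coeffs):
        acc = acc * alpha + a
    return acc

def multiplicity(coeffs, alpha):
    """Number of successive derivatives f, f', f'', ... that vanish at alpha."""
    m = 0
    while coeffs and evaluate(coeffs, alpha) == 0:
        m += 1
        coeffs = derivative(coeffs)
    return m

f = [2, -3, 0, 1]              # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
print(multiplicity(f, 1), multiplicity(f, -2), multiplicity(f, 0))
```

In positive characteristic this count can fail, because the factor $m$ in the proof above may vanish in $k$.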


3.3.3 Separable Polynomials 


A polynomial $f \in k[x]$ is called separable if $f$ has no multiple roots in any
commutative ring $K \supset k$. By Proposition 3.6 and what was said in Sect. 3.3.1,
a polynomial $f \in k[x]$ is separable if and only if $\mathrm{GCD}(f, f') = 1$. Note that this
condition can be checked by the Euclidean algorithm within $k[x]$.


Example 3.4 (Irreducible Polynomials) Let $f \in k[x]$ be irreducible. Then $f$ is
coprime to all nonzero polynomials of smaller degree. Therefore, an irreducible
polynomial $f$ is separable as soon as $f' \neq 0$. Since for $\operatorname{char} k = 0$ and $\deg f > 0$
we always have $f' \neq 0$, every irreducible polynomial over a field of characteristic
zero is separable. If $\operatorname{char} k = p > 0$, then $f' = 0$ if and only if $f(x) = g(x^p)$ for
some $g \in k[x]$, as we have seen in Example 3.1 on p. 45. By Lemma 3.1 on p. 45,
for $k = \mathbb{F}_p$ all such polynomials are exhausted by the $p$th powers $g^p$, $g \in \mathbb{F}_p[x]$.
In particular, they are all reducible. Therefore, all irreducible polynomials in $\mathbb{F}_p[x]$
are separable as well. Over larger fields of characteristic $p > 0$, this may no longer
be true. For example, for the field $k = \mathbb{F}_p(t)$ of rational functions in the variable⁵
$t$ with coefficients in $\mathbb{F}_p$, the polynomial $f(x) = x^p - t \in k[x]$ can be shown to be
irreducible.⁶ However, $f' = 0$, i.e., $f$ is not separable.


3.4 Adjunction of Roots 


3.4.1 Residue Class Rings 


Let $k$ be a field and $f \in k[x]$ a nonconstant polynomial. The residue class ring
$k[x]/(f)$ is defined in the same way as the residue class ring $\mathbb{Z}/(n)$ was defined in
Sect. 2.4 on p. 27. We write $(f) = \{ fh \mid h \in k[x] \}$ for the subring of all polynomials
divisible by $f$ and say that polynomials $g_1, g_2 \in k[x]$ are congruent modulo $f$ if
$g_1 - g_2 \in (f)$. In this case, we write $g_1 \equiv g_2 \pmod{f}$.


Exercise 3.11 Verify that congruence modulo f is an equivalence relation on k[x]. 


This equivalence decomposes $k[x]$ into a disjoint union of equivalence classes

$$[g]_f \stackrel{\text{def}}{=} g + (f) = \{ g + fh \mid h \in k[x] \}$$

called residue classes modulo $f$. Addition and multiplication of residue classes are
defined by

$$[g] + [h] \stackrel{\text{def}}{=} [g + h], \qquad [g] \cdot [h] \stackrel{\text{def}}{=} [gh]. \tag{3.15}$$


Exercise 3.12 Verify that the classes $[g + h]$ and $[gh]$ do not depend on the particular
choice of $g \in [g]$ and $h \in [h]$.


Since the right-hand sides of formulas (3.15) deal with ordinary operations within
$k[x]$, the addition and multiplication of residue classes satisfy the axioms of a
commutative ring with unit. The zero element of the ring $k[x]/(f)$ is $[0]_f = (f)$, and
the unit element is $[1]_f = 1 + (f)$. The homomorphism $k \to k[x]/(f)$, $c \mapsto [c]_f$,
which sends $c \in k$ to the residue class of the constant polynomial $c \in k[x]$, is
nonzero and therefore injective by Proposition 2.2 on p. 34. In what follows, we
identify $k$ with its image under this embedding and write $c$ instead of $[c]_f$ for $c \in k$.


Exercise 3.13 Show that $k[x]/(x - \alpha) \simeq k$ for all $\alpha \in k$.


Since every $g \in k[x]$ is uniquely expressed as $g = fh + r$, where either $\deg r <
\deg f$ or $r = 0$, every nonzero residue class $[g]_f$ has a unique representative $r \in [g]_f$
such that $\deg(r) < \deg(f)$. Therefore, each residue class can be written as

$$[a_0 + a_1 x + \cdots + a_{n-1} x^{n-1}] = a_0 + a_1 \vartheta + \cdots + a_{n-1} \vartheta^{n-1},$$

where $n = \deg f$, $\vartheta \stackrel{\text{def}}{=} [x]_f$, $a_i \in k$, and the equality

$$a_0 + a_1 \vartheta + \cdots + a_{n-1} \vartheta^{n-1} = b_0 + b_1 \vartheta + \cdots + b_{n-1} \vartheta^{n-1}$$

holds in $k[x]/(f)$ if and only if $a_i = b_i$ in $k$ for all $i$.

⁵ See Sect. 4.2 on p. 76.
⁶ With our current equipment it is not so obvious. However, it follows at once from Gauss's lemma
(see Lemma 5.4 on p. 117) by means of an Eisenstein-type argument (see Example 5.8 on p. 119).

Note that $\vartheta = [x]_f$ is a root of $f$, because $f(\vartheta) = f([x]_f) = [f(x)]_f = [0]_f$ in
$k[x]/(f)$. For this reason, the residue class ring $k[x]/(f)$ is often called an extension
of $k$ by adjunction of the root $\vartheta$ of the polynomial $f \in k[x]$. From this viewpoint,
addition and multiplication of residue classes can be treated as ordinary algebraic
manipulations with formal expressions

$$a_0 + a_1 \vartheta + \cdots + a_{n-1} \vartheta^{n-1} \tag{3.16}$$

obeying the standard rules for multiplying out and collecting like terms except for
the one extra relation $f(\vartheta) = 0$ on the symbol $\vartheta$.

For example, the elements of the residue class ring $\mathbb{Q}[x]/(x^2 - 2)$ are represented
by formal expressions of the form $a + b\sqrt{2}$, where $\sqrt{2} \stackrel{\text{def}}{=} [x]$ satisfies the relation
$(\sqrt{2})^2 = [x]^2 = [x^2] = 2$. Under this relation, the addition and multiplication of
such expressions are completely determined by the associative, commutative, and
distributive laws:

$$\begin{aligned}
(a + b\sqrt{2}) + (c + d\sqrt{2}) &= (a + c) + (b + d)\sqrt{2},\\
(a + b\sqrt{2})(c + d\sqrt{2}) &= (ac + 2bd) + (cb + ad)\sqrt{2}.
\end{aligned}$$
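
The rule "multiply out, then apply the relation $f(\vartheta) = 0$" can be sketched in Python for a monic $f$ over $\mathbb{Q}$. The function name `mul_mod` and the representation of a residue class by the list $[c_0, \ldots, c_{n-1}]$ of coefficients of its representative of degree less than $n$ are assumptions made for this illustration only.

```python
from fractions import Fraction

def mul_mod(a, b, f):
    """Product of residue classes a, b in Q[x]/(f).

    a, b: lists [c_0, ..., c_{n-1}] of length n = deg f;
    f:    coefficients [f_0, ..., f_{n-1}, 1] of a monic polynomial.
    """
    n = len(f) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += Fraction(x) * y
    # Eliminate theta^d for d >= n, highest power first, using
    # theta^n = -(f_0 + f_1 theta + ... + f_{n-1} theta^{n-1}).
    for d in range(2 * n - 2, n - 1, -1):
        c = prod[d]
        prod[d] = Fraction(0)
        for i in range(n):
            prod[d - n + i] -= c * f[i]
    return prod[:n]

# (1 + 2*sqrt2)(3 + 4*sqrt2) in Q[x]/(x^2 - 2): (ac + 2bd) + (cb + ad) sqrt2
print(mul_mod([1, 2], [3, 4], [-2, 0, 1]))
```

For $f = x^2 - 2$ the reduction loop runs once and reproduces the multiplication formula displayed above.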


Exercise 3.14 Verify that $\mathbb{Q}[x]/(x^2 - 2)$ is a field. Is the same true for residue class
rings (a) $\mathbb{Q}[x]/(x^3 + 1)$ and (b) $\mathbb{Q}[x]/(x^3 + 2)$?


Proposition 3.8 For a field $k$ and nonconstant polynomial $f \in k[x]$, the residue
class ring $k[x]/(f)$ is a field if and only if $f$ is irreducible in $k[x]$.


Proof If $f = gh$, where $\deg g, \deg h < \deg f$, then both classes $[g]_f$, $[h]_f$ are nonzero
but have zero product in $k[x]/(f)$. This prevents the latter from being a field. If
$f$ is irreducible, then for every $g \notin (f)$, we have $\mathrm{GCD}(f, g) = 1$, and therefore
$fh + gq = 1$ for some $h, q \in k[x]$. This forces $[q] \cdot [g] = [1]$ in $k[x]/(f)$. Thus, every
nonzero residue class $[g] \in k[x]/(f)$ is invertible. □


Exercise 3.15 In $\mathbb{Q}[x]/(x^2 + x + 1)$, write an explicit formula for the inverse element
of $\vartheta - a$, where $a \in \mathbb{Q}$ and $\vartheta = [x]$.


Proposition 3.9 (Chinese Remainder Theorem) Let $k$ be an arbitrary field and
$f \in k[x]$ a product of $m$ mutually coprime polynomials:

$$f = f_1 f_2 \cdots f_m, \quad \text{where } \mathrm{GCD}(f_i, f_j) = 1 \text{ for all } i \neq j.$$

Then an isomorphism of commutative rings

$$\varphi \colon k[x]/(f) \xrightarrow{\ \sim\ } \bigl(k[x]/(f_1)\bigr) \times \bigl(k[x]/(f_2)\bigr) \times \cdots \times \bigl(k[x]/(f_m)\bigr)$$

is well defined by $[g]_f \mapsto \bigl([g]_{f_1}, [g]_{f_2}, \ldots, [g]_{f_m}\bigr)$.


Exercise 3.16 Use the arguments from Sect. 2.7 on p. 34 to verify that $\varphi$ is a well-
defined injective ring homomorphism.


Proof (of Proposition 3.9) It remains to verify that $\varphi$ is surjective, i.e., that for every
collection of residue classes $[r_i] \in k[x]/(f_i)$, there exists a polynomial $g \in k[x]$ such
that $g \equiv r_i \pmod{f_i}$ for all $i$ simultaneously. We proceed as in Example 2.7 on p. 35.
For each $i$, write $F_i = \prod_{\nu \neq i} f_\nu$ for the product of all $f_\nu$ except $f_i$. Since $f_i$ is coprime
to all those $f_\nu$, by Lemma 2.3 on p. 26 it is coprime to $F_i$ as well. Hence, there exists
some⁷ $h_i \in k[x]$ such that $F_i h_i \equiv 1 \pmod{f_i}$. Clearly, $F_i h_i \equiv 0 \pmod{f_\nu}$ for all
$\nu \neq i$. Thus, $g = \sum_\nu r_\nu F_\nu h_\nu \equiv r_i \pmod{f_i}$ for all $i$, as required. □


3.4.2 Algebraic Elements 


Let $k \subset F$ be an arbitrary extension of fields. An element $\zeta \in F$ is said to be
algebraic over $k$ if $f(\zeta) = 0$ for some nonzero polynomial $f \in k[x]$. The monic
polynomial of minimal degree with this property is called the minimal polynomial
of $\zeta$ over $k$ and is denoted by $\mu_\zeta$. Every polynomial $f \in k[x]$ such that $f(\zeta) = 0$ is
divisible by $\mu_\zeta$, because division with remainder leads to $f(x) = q(x)\mu_\zeta(x) + r(x)$,
where either $r = 0$ or $\deg r < \deg \mu_\zeta$, and the latter case is impossible, since
$r(\zeta) = f(\zeta) - q(\zeta)\mu_\zeta(\zeta) = 0$. Therefore, the minimal polynomial $\mu_\zeta$ is uniquely
determined by $\zeta$. It is irreducible, because a factorization $\mu_\zeta = g \cdot h$ would force
$g(\zeta) = 0$ or $h(\zeta) = 0$, which is impossible for $\deg g, \deg h < \deg \mu_\zeta$.

For an element $\zeta \in F$, the evaluation map $\mathrm{ev}_\zeta \colon k[x] \to F$, $f \mapsto f(\zeta)$,
is a ring homomorphism. Its image $\operatorname{im} \mathrm{ev}_\zeta \subset F$ is the minimal subring, with
respect to inclusion, in $F$ containing $k$ and $\zeta$. It is usually denoted by $k[\zeta]$. If $\zeta$ is
algebraic over $k$, then the above arguments show that $\ker \mathrm{ev}_\zeta = (\mu_\zeta)$ consists of all
polynomials divisible by the minimal polynomial of $\zeta$. Hence, the evaluation map
can be factorized through the quotient homomorphism $k[x] \twoheadrightarrow k[x]/(\mu_\zeta)$ followed
by the inclusion of fields $\varphi \colon k[x]/(\mu_\zeta) \hookrightarrow F$ mapping $[f] \mapsto f(\zeta)$. Indeed, $\varphi$
is well defined, because $g = f + h\mu_\zeta$ forces $g(\zeta) = f(\zeta)$, and it is injective by
Proposition 2.2 on p. 34, because $k[x]/(\mu_\zeta)$ is a field. We conclude that the smallest
subring $k[\zeta] \subset F$ containing $k$ and $\zeta$ is a field isomorphic to the quotient $k[x]/(\mu_\zeta)$.

⁷ This polynomial can be computed explicitly by the Euclidean algorithm applied to $F_i$ and $f_i$.

For example, the smallest subring in $\mathbb{R}$ containing $\mathbb{Q}$ and the real number $\sqrt{2} \in \mathbb{R}$
is a field isomorphic to the field $\mathbb{Q}[x]/(x^2 - 2)$ considered before Exercise 3.14 on
p. 33.


Theorem 3.1 For a field $k$ and polynomial $f \in k[x]$ of positive degree, there exists
a field $F \supset k$ such that in $F[x]$, the polynomial $f$ can be factored completely into
a product of $\deg f$ linear factors. Equivalently, $f$ acquires $\deg f$ roots (counted with
multiplicities) in $F$.


Proof By induction on $n = \deg f$. For $n = 1$, the statement is trivial. Assume that
it holds for all fields $k$ and all polynomials of degree less than $n$. Consider a field $k$,
and let $f \in k[x]$ have degree $n$. If $f = gh$, where $\deg g, \deg h < n$, then by induction,
there is a field $F' \supset k$ such that $g$ is a product of $\deg g$ linear factors in $F'[x]$.
Applying the inductive hypothesis once more to $F'$ and $h$, we pass to a field $F \supset F'$
over which $h$ is completely factorizable as well. Then $f = gh$ also can be written as
a product of $\deg f = \deg g + \deg h$ linear factors in $F[x]$. If $f$ is irreducible, we put
$F' = k[x]/(f) \supset k$. Since $f$ acquires a root $\vartheta = [x] \in F'$, $f$ becomes divisible by
$(x - \vartheta)$ in $F'[x]$, and we can repeat the previous arguments. □


3.4.3 Algebraic Closure 


A field $k$ is said to be algebraically closed if every nonconstant polynomial $f \in k[x]$ has a
root in $k$. Equivalently, we can say that every nonconstant $f \in k[x]$ is completely factorizable
into a product of $\deg f$ linear factors (some of which may coincide). One of the
most important examples of an algebraically closed field is provided by the field of
complex numbers $\mathbb{C}$. We define and study the field $\mathbb{C}$ in the next section. A proof
that it is algebraically closed⁸ will be sketched in Problem 3.34 on p. 70.


3.5 The Field of Complex Numbers 


3.5.1 The Complex Plane 


We write $\mathbb{C} = \mathbb{R}[t]/(t^2 + 1)$ for the extension of $\mathbb{R}$ by a root of the irreducible
quadratic polynomial $t^2 + 1$. Therefore, $\mathbb{C}$ consists of residue classes $[x + yt] =
x + y \cdot i$, where $x, y \in \mathbb{R}$ and $i \stackrel{\text{def}}{=} [t]$ satisfies the relation $i^2 = [t^2] = [-1] = -1$, i.e.,
it could be written as $i = \sqrt{-1}$. Addition and multiplication in $\mathbb{C}$ are described by
the formulas

$$\begin{aligned}
(x_1 + i y_1) + (x_2 + i y_2) &= (x_1 + x_2) + i (y_1 + y_2),\\
(x_1 + i y_1)(x_2 + i y_2) &= (x_1 x_2 - y_1 y_2) + i (x_1 y_2 + x_2 y_1).
\end{aligned} \tag{3.17}$$

⁸ Although this fact is widely known as the fundamental theorem of algebra, its whys and
wherefores are rather analytic-geometrical.

The inverse of a nonzero complex number $x + yi \in \mathbb{C}$ is

$$\frac{1}{x + yi} = \frac{x}{x^2 + y^2} - \frac{y}{x^2 + y^2} \cdot i.$$


The elements of $\mathbb{C}$ can be depicted in the Euclidean plane $\mathbb{R}^2$ equipped with a
rectangular coordinate system $XOY$ (see Fig. 3.1). A complex number $z = x + yi$
is represented there by a radial vector joining the origin $O = (0, 0)$ to the point
$z = (x, y)$. We write $|z| = \sqrt{x^2 + y^2}$ for the length⁹ of this vector. The coordinates
$x$ and $y$ are called the real and imaginary parts of $z$ and are denoted by $\operatorname{Re}(z) = x$
and $\operatorname{Im}(z) = y$. Let

$$\operatorname{Arg}(z) = \alpha + 2\pi\mathbb{Z} = \{ \alpha + 2\pi k \in \mathbb{R} \mid k \in \mathbb{Z} \} \tag{3.18}$$

be the set of all $\vartheta \in \mathbb{R}$ such that a rotation of the plane about the origin through the
angle $\vartheta$ (measured in radians) moves the ray $OX$ to the ray $Oz$. All these angles are
congruent modulo integer multiples of $2\pi$ and are equal to oriented¹⁰ lengths of arcs
of the unit circle $S^1$ going from $(1, 0)$ to the intersection point of $S^1$ with the ray $Oz$.
We write $\alpha \in \operatorname{Arg}(z)$ for the oriented length of the shortest among such arcs. The
congruence class (3.18) is called the argument of $z$. Thus, each $z = x + yi \in \mathbb{C}$ has
$\operatorname{Re}(z) = |z| \cdot \cos\alpha$, $\operatorname{Im}(z) = |z| \cdot \sin\alpha$, and can be written as $z = |z| \cdot (\cos\alpha + i \cdot \sin\alpha)$.


Lemma 3.2 The radial vectors of the points $z \in \mathbb{R}^2$ form a field with respect to the
usual addition of vectors¹¹ and multiplication defined by the rule that lengths are
multiplied and arguments are added, i.e., by the formulas

$$\begin{aligned}
|z_1 z_2| &= |z_1| \cdot |z_2|,\\
\operatorname{Arg}(z_1 z_2) &= \operatorname{Arg}(z_1) + \operatorname{Arg}(z_2) = \{ \vartheta_1 + \vartheta_2 \mid \vartheta_1 \in \operatorname{Arg}(z_1),\ \vartheta_2 \in \operatorname{Arg}(z_2) \}.
\end{aligned} \tag{3.19}$$

This field is isomorphic to $\mathbb{C}$. The isomorphism takes the complex number $x + iy \in \mathbb{C}$
to the radial vector of the point $z = (x, y) \in \mathbb{R}^2$.

⁹ Also called the modulus or absolute value of $z$.
¹⁰ That is, taken with a positive sign when an arc goes counterclockwise and with a negative sign
otherwise.
¹¹ See Example 2.4 on p. 22.


[Figure: the radial vector of $z = x + iy$ with $|z| = \sqrt{x^2 + y^2}$ and $\operatorname{Arg}(z) = \alpha + 2\pi k$, $k \in \mathbb{Z}$; the unit circle $S^1 = \{(x, y) \mid x^2 + y^2 = 1\}$; the inverse $z^{-1}$ with $|z^{-1}| = |z|^{-1}$ and $\operatorname{Arg}(z^{-1}) = -\alpha + 2\pi k$, $k \in \mathbb{Z}$.]

Fig. 3.1 Geometric ingredients of a complex number


Exercise 3.17 Check that the addition of arguments used in the second formula
of (3.19) is well defined in the sense that the right-hand side is actually a congruence
class modulo $2\pi \cdot \mathbb{Z}$, as it must be by formula (3.18).


Proof (of Lemma 3.2) We have seen in Example 2.4 on p. 22 that vectors form an
additive abelian group. Multiplication (3.19) is clearly commutative and associative.
The unit direction vector of the $OX$-axis, which has length 1 and argument 0, is the
neutral element for multiplication. The inverse of a nonzero vector $z$ is the vector
$z^{-1}$, which has $|z^{-1}| = 1/|z|$ and $\operatorname{Arg}(z^{-1}) = -\operatorname{Arg}(z)$ (see Fig. 3.1). Therefore,
the nonzero vectors form a multiplicative abelian group whose unit element differs
from zero. It remains to check distributivity. The multiplication map $\lambda_a \colon z \mapsto az$,
which multiplies all vectors by some fixed vector $a \neq 0$, is a rotary dilation¹² of
$\mathbb{R}^2$ at the origin by the angle $\operatorname{Arg}(a)$ and scaling factor $|a|$. The distributive law
$a(b + c) = ab + ac$ says that a rotary dilation respects the addition of vectors,
i.e., $\lambda_a(b + c) = \lambda_a(b) + \lambda_a(c)$. This is true, because both rotations and dilations
take parallelograms to parallelograms. Hence, the radial vectors of points in $\mathbb{R}^2$
form a field. It contains a subfield formed by the vectors parallel to the $OX$-axis.
Let us identify this subfield with¹³ $\mathbb{R}$ and write $i$ for the unit direction vector of the
$OY$-axis. The radial vector of a point $z = (x, y) \in \mathbb{R}^2$ can be uniquely expressed as
$z = x + i \cdot y$, where $x, y \in \mathbb{R}$ (i.e., they are parallel to the $OX$-axis), and the operations
$+$, $\cdot$ are those from formula (3.19). It follows from (3.19) that $i^2 = -1$. Thus, by
distributivity, the addition and multiplication of vectors $x_1 + i y_1$ and $x_2 + i y_2$ are
described by the same rules (3.17) as the addition and multiplication in the quotient
field $\mathbb{C} = \mathbb{R}[t]/(t^2 + 1)$. □

¹² That is, the composition of a rotation and dilation with the same center (since they commute, it
does not matter which one was done first).


Agreement 3.1 From this point on, we shall identify the field $\mathbb{C} = \mathbb{R}[t]/(t^2 + 1)$
with the field of radial vectors in $\mathbb{R}^2$. We shall refer to the complex plane when we
switch to a geometric interpretation of the complex numbers. The coordinate axes
$OX$ and $OY$ are called the real and imaginary axes of the complex plane.


3.5.2 Complex Conjugation 


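The complex number $\bar{z} = x - iy$ is called the conjugate of the complex number
$z = x + iy$. Since $z\bar{z} = x^2 + y^2 = |z|^2$, we get the very convenient inversion
formula $z^{-1} = \bar{z}/|z|^2$. Geometrically, the conjugation map $\mathbb{C} \to \mathbb{C}$, $z \mapsto \bar{z}$, is the
reflection of the complex plane in the real axis. Algebraically, complex conjugation
is an involutive¹⁴ automorphism of the field $\mathbb{C}$ over the subfield $\mathbb{R} \subset \mathbb{C}$. This means
that $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$ and $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$ for all $z_1, z_2 \in \mathbb{C}$, $\bar{\bar{z}} = z$ for all $z \in \mathbb{C}$, and
$\bar{z} = z$ if and only if $z \in \mathbb{R}$.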


3.5.3 Trigonometry 


You probably studied trigonometry in high school and perhaps even recall some of
that depressing conglomeration of trigonometric identities. If so, your long nightmare
is over. Those trigonometric identities are just simple polynomial expressions
in $z \in \mathbb{C}$ restricted to the unit circle $S^1 \subset \mathbb{R}^2$ and rewritten in terms of the real
and imaginary parts $x, y$ now given the names $\cos\varphi$ and $\sin\varphi$. In most cases, this
separation into sine and cosine has the effect of making a simple expression more
complicated.

For example, consider two complex numbers $z_1 = \cos\varphi_1 + i\sin\varphi_1$, $z_2 =
\cos\varphi_2 + i\sin\varphi_2$ on the unit circle, where $\varphi_1 \in \operatorname{Arg}(z_1)$, $\varphi_2 \in \operatorname{Arg}(z_2)$. Then their
product, computed by (3.19), is $z_1 z_2 = \cos(\varphi_1 + \varphi_2) + i\sin(\varphi_1 + \varphi_2)$, whereas the
formula (3.17) leads to

$$z_1 z_2 = (\cos\varphi_1 \cos\varphi_2 - \sin\varphi_1 \sin\varphi_2) + i(\cos\varphi_1 \sin\varphi_2 + \sin\varphi_1 \cos\varphi_2).$$

¹³ Note that the multiplication in $\mathbb{R}$ agrees with that defined in (3.19).
¹⁴ An endomorphism $\iota \colon X \to X$ of a set $X$ is called an involution if $\iota \circ \iota = \mathrm{Id}_X$.

Comparison of the real and imaginary parts proves at once the two most famous
trigonometric formulas:

$$\begin{aligned}
\cos(\varphi_1 + \varphi_2) &= \cos\varphi_1 \cos\varphi_2 - \sin\varphi_1 \sin\varphi_2,\\
\sin(\varphi_1 + \varphi_2) &= \cos\varphi_1 \sin\varphi_2 + \sin\varphi_1 \cos\varphi_2.
\end{aligned}$$

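Example 3.5 (Multiple Angles) Let us take $z = \cos\varphi + i\sin\varphi$ and compute $z^n =
\cos(n\varphi) + i\sin(n\varphi)$ by expanding $(\cos\varphi + i\sin\varphi)^n$ via the binomial formula (1.7)
on p. 6. We get

$$\begin{aligned}
\cos(n\varphi) + i\sin(n\varphi) &= (\cos\varphi + i\sin\varphi)^n\\
&= \cos^n\varphi + i\binom{n}{1}\cos^{n-1}\varphi\,\sin\varphi - \binom{n}{2}\cos^{n-2}\varphi\,\sin^2\varphi - i\binom{n}{3}\cos^{n-3}\varphi\,\sin^3\varphi + \cdots\\
&= \Bigl(\binom{n}{0}\cos^n\varphi - \binom{n}{2}\cos^{n-2}\varphi\,\sin^2\varphi + \binom{n}{4}\cos^{n-4}\varphi\,\sin^4\varphi - \cdots\Bigr)\\
&\quad + i \cdot \Bigl(\binom{n}{1}\cos^{n-1}\varphi\,\sin\varphi - \binom{n}{3}\cos^{n-3}\varphi\,\sin^3\varphi + \binom{n}{5}\cos^{n-5}\varphi\,\sin^5\varphi - \cdots\Bigr).
\end{aligned}$$

This equality describes the entire trigonometry of multiple angles:

$$\begin{aligned}
\cos(n\varphi) &= \binom{n}{0}\cos^n\varphi - \binom{n}{2}\cos^{n-2}\varphi\,\sin^2\varphi + \binom{n}{4}\cos^{n-4}\varphi\,\sin^4\varphi - \cdots,\\
\sin(n\varphi) &= \binom{n}{1}\cos^{n-1}\varphi\,\sin\varphi - \binom{n}{3}\cos^{n-3}\varphi\,\sin^3\varphi + \binom{n}{5}\cos^{n-5}\varphi\,\sin^5\varphi - \cdots.
\end{aligned}$$

For example, $\cos 3\varphi = \cos^3\varphi - 3\cos\varphi \cdot \sin^2\varphi = 4\cos^3\varphi - 3\cos\varphi$.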
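
The multiple-angle expansions are easy to check numerically. The sketch below sums the even and odd terms of the binomial expansion separately. The function name `multiple_angle` is our own, and `math.comb` assumes Python 3.8 or later.

```python
import math

def multiple_angle(n, phi):
    """Return (cos(n*phi), sin(n*phi)) computed from the binomial expansion."""
    c, s = math.cos(phi), math.sin(phi)
    # i**j contributes the sign (-1)**(j // 2); even j are real, odd j imaginary.
    cos_n = sum((-1) ** (j // 2) * math.comb(n, j) * c ** (n - j) * s ** j
                for j in range(0, n + 1, 2))
    sin_n = sum((-1) ** (j // 2) * math.comb(n, j) * c ** (n - j) * s ** j
                for j in range(1, n + 1, 2))
    return cos_n, sin_n

phi = 0.7
print(multiple_angle(3, phi))
print(math.cos(3 * phi), math.sin(3 * phi))   # should agree up to rounding
```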


Exercise 3.18 Express $\sin(2\pi/5)$ and $\cos(2\pi/5)$ in terms of roots of rational
numbers.


3.5.4 Roots of Unity and Cyclotomic Polynomials


Let us solve the equation $z^n = 1$ in $\mathbb{C}$. Comparison of absolute values leads to
$|z^n| = |z|^n = 1$ and forces $|z| = 1$. Comparison of arguments gives $n\operatorname{Arg}(z) =
\operatorname{Arg}(1) = \{2\pi k \mid k \in \mathbb{Z}\}$, meaning that $\operatorname{Arg}(z) = \{2\pi k/n \mid k \in \mathbb{Z}\}$. Therefore, the
polynomial $z^n - 1$ has exactly $n$ distinct complex roots

$$\zeta_k = \cos(2\pi k/n) + i\sin(2\pi k/n), \quad \text{where } k = 0, 1, \ldots, (n - 1).$$

They form the vertices of a regular $n$-gon inscribed in the unit circle in such a way
that $\zeta_0 = 1$ (see Fig. 3.2 on p. 60). These roots form a multiplicative abelian group
called the group of $n$th roots of unity and denoted by $\mu_n$. The group $\mu_n$ is isomorphic
to the cyclic group of order $n$ from Example 1.8 on p. 13.

A root $\zeta \in \mu_n$ is called primitive¹⁵ if its powers $\zeta^k$, $k \in \mathbb{N}$, exhaust the whole of
$\mu_n$. For example, the root of minimal positive argument

$$\zeta_1 = \cos(2\pi/n) + i\sin(2\pi/n)$$

is always primitive. All four nontrivial roots of $\mu_5$ are primitive (see Fig. 3.2). In
$\mu_6$, there are just two primitive roots, $\zeta_1$ and $\zeta_5 = \zeta_1^{-1}$ (see Fig. 3.3 on p. 61).


Fig. 3.2 Roots of $z^5 = 1$


Exercise 3.19 Verify that the root $\zeta_1^k = \cos(2\pi k/n) + i\sin(2\pi k/n)$ is primitive in
$\mu_n$ if and only if $\mathrm{GCD}(k, n) = 1$.
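
The statement of Exercise 3.19 can be explored experimentally before it is proved. In the sketch below, the root $\zeta_1^j \in \mu_n$ is encoded by the exponent $j$ modulo $n$, so that multiplication of roots becomes addition of exponents. This encoding and the names `is_primitive`, `primitive_exponents` are assumptions made for illustration, and the computation is of course not a proof.

```python
from math import gcd

def is_primitive(k, n):
    """True if the powers of zeta_1**k exhaust mu_n (roots encoded by exponents mod n)."""
    powers = {(k * j) % n for j in range(1, n + 1)}
    return len(powers) == n

def primitive_exponents(n):
    return [k for k in range(n) if is_primitive(k, n)]

print(primitive_exponents(5))   # the four nontrivial roots of mu_5
print(primitive_exponents(6))   # zeta_1 and zeta_5
print(all(is_primitive(k, n) == (gcd(k, n) == 1)
          for n in range(1, 40) for k in range(n)))
```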


¹⁵ In other terminology, a generating root of unity.

The monic polynomial whose roots are exactly the primitive $n$th roots of unity is
denoted by

$$\Phi_n(z) = \prod_{\substack{1 \leq k < n:\\ \mathrm{GCD}(k, n) = 1}} (z - \zeta_1^k) \tag{3.20}$$

and called the $n$th cyclotomic polynomial. For example, the fifth and sixth cyclotomic
polynomials are

$$\begin{aligned}
\Phi_5(z) &= (z - \zeta_1)(z - \zeta_2)(z - \zeta_3)(z - \zeta_4) = z^4 + z^3 + z^2 + z + 1,\\
\Phi_6(z) &= (z - \zeta_1)(z - \zeta_5) = z^2 - z + 1.
\end{aligned}$$

One can show¹⁶ that all cyclotomic polynomials are irreducible and have integer
coefficients. Some basic properties of $\Phi_n$ are listed in Problem 3.32 on p. 70 below.


Fig. 3.3 Roots of $z^6 = 1$

¹⁶ This is not so easy, and we shall return to this problem in the forthcoming Algebra II.


Example 3.6 (The Equation $z^n = a$) The complex roots of the equation $z^n = a$,
where $a = |a| \cdot (\cos\alpha + i\sin\alpha) \neq 0$, are the numbers

$$z_k = \sqrt[n]{|a|} \cdot \Bigl(\cos\frac{\alpha + 2\pi k}{n} + i \cdot \sin\frac{\alpha + 2\pi k}{n}\Bigr), \quad 0 \leq k \leq n - 1.$$

They form the vertices of a regular $n$-gon inscribed in a circle of radius $\sqrt[n]{|a|}$
centered at the origin and rotated in such a way that the radial vector of one of
the vertices forms the angle $\alpha/n$ with the $OX$-axis.
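
The formula of Example 3.6 can be sketched with Python's standard `cmath` module, which converts between Cartesian and polar form. The function name `nth_roots` is our own, and the results are floating-point approximations of the exact roots.

```python
import cmath
import math

def nth_roots(a, n):
    """Approximate the n complex roots of z**n = a for a nonzero complex a."""
    r, alpha = cmath.polar(a)          # a = r (cos alpha + i sin alpha)
    radius = r ** (1.0 / n)            # the real n-th root of |a|
    return [cmath.rect(radius, (alpha + 2 * math.pi * k) / n) for k in range(n)]

for z in nth_roots(-8, 3):
    print(z, abs(z ** 3 + 8))          # the second number should be close to 0
```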


3.5.5 The Gaussian Integers 


The complex numbers with integer coordinates form a subring in $\mathbb{C}$ denoted by
$\mathbb{Z}[i] = \{ z = x + iy \mid x, y \in \mathbb{Z} \}$ and called the ring of Gaussian integers. The Gaussian
integers are widely used in number theory. For example, they clarify the following
classical problem: which integers $m \in \mathbb{Z}$ can be written in the form $x^2 + y^2$ for some
$x, y \in \mathbb{Z}$? Since $x^2 + y^2 = (x + iy)(x - iy)$ in $\mathbb{Z}[i]$, the solvability of the equation
$m = x^2 + y^2$ in $x, y \in \mathbb{Z}$ is equivalent to the solvability of the equation $m = z \cdot \bar{z}$ in
$z \in \mathbb{Z}[i]$. If the latter is solvable for some $m_1, m_2 \in \mathbb{Z}$, i.e.,

$$\begin{aligned}
m_1 &= a_1^2 + b_1^2 = (a_1 + i b_1)(a_1 - i b_1) = z_1 \bar{z}_1,\\
m_2 &= a_2^2 + b_2^2 = (a_2 + i b_2)(a_2 - i b_2) = z_2 \bar{z}_2,
\end{aligned}$$

then it is solvable for the product $m = m_1 m_2$ as well:

$$m = z_1 z_2 \cdot \overline{z_1 z_2} = |z_1 z_2|^2 = (a_1 a_2 - b_1 b_2)^2 + (a_1 b_2 + a_2 b_1)^2.$$

This reduces the question to the representability of prime numbers. We postpone
further analysis until Example 5.6 on p. 115.
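
The product formula is just multiplication of Gaussian integers read off coordinate by coordinate, as the following small sketch illustrates. The function name `multiply_representations` and the encoding of $a + ib$ as a pair `(a, b)` are assumptions made for illustration.

```python
def multiply_representations(rep1, rep2):
    """Given (a1, b1) and (a2, b2), return (x, y) = coordinates of (a1 + i b1)(a2 + i b2)."""
    a1, b1 = rep1
    a2, b2 = rep2
    return (a1 * a2 - b1 * b2, a1 * b2 + a2 * b1)

# 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2 give a representation of 65 = 5 * 13
x, y = multiply_representations((1, 2), (2, 3))
print(x, y, x * x + y * y)
```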


Exercise 3.20 Show that $\mathbb{Z}[i]$ has exactly four invertible elements: $\pm 1$ and $\pm i$.


3.6 Finite Fields 


3.6.1 Finite Multiplicative Subgroups in Fields 


In this section we consider abelian groups with a multiplicative group operation.
Such a group $A$ is called cyclic if there exists an element $a \in A$ whose integer
powers $a^n$, $n \in \mathbb{Z}$, exhaust the whole group. An element $a \in A$ possessing this
property is called a generator of the cyclic group $A$. For example, the group of $n$th
roots of unity $\mu_n \subset \mathbb{C}$ is cyclic, and its generators are the primitive roots.¹⁷

If $A$ is a finite group, then the integer powers of an element $b \in A$ cannot all be
distinct, i.e., $b^k = b^m$ for some $k > m$. This forces $b^{k-m} = 1$. The minimal $m \in \mathbb{N}$
such that $b^m = 1$ is called the order of $b$ and is denoted by $\operatorname{ord}(b)$. If $\operatorname{ord}(b) = n$,
then the $n$ elements $b^0 \stackrel{\text{def}}{=} 1$, $b^1 = b$, $b^2$, $\ldots$, $b^{n-1}$ are distinct and exhaust the set of
all integer powers $b^m$, because $b^{nq+r} = (b^n)^q b^r = b^r$ for every $m = nq + r \in \mathbb{Z}$.
Note that $b$ generates $A$ if and only if $\operatorname{ord}(b) = |A|$; thus the cardinality of the cyclic
group coincides with the order of every generator of the group.

¹⁷ See Sect. 3.5.4 on p. 60.


Theorem 3.2 Every finite subgroup $A$ of the multiplicative group $k^*$ of a field $k$ is
cyclic.


Proof Let $m = \max_{b \in A} \operatorname{ord}(b)$. If the order of every element in $A$ divides $m$, then all
of them are roots of the polynomial $x^m - 1$. Since this polynomial has at most $m$
roots in the field $k$ by Corollary 3.2 on p. 50, we get the inequality $|A| \leq m$, which
forces every element of order $m$ to be a generator of $A$. To prove that the orders of all
elements in $A$ divide the maximal order, it is sufficient to construct an element $b \in A$
of order $\mathrm{LCM}(m_1, m_2)$ for any two elements $b_1, b_2 \in A$ of distinct orders $m_1$, $m_2$.


Exercise 3.21 Use Exercise 2.8 to find coprime $n_1, n_2 \in \mathbb{N}$ such that $m_1 = k_1 n_1$,
$m_2 = k_2 n_2$, $\mathrm{GCD}(k_1, n_1) = \mathrm{GCD}(k_2, n_2) = 1$, and $\mathrm{LCM}(m_1, m_2) = n_1 n_2$.


Let $a_1 = b_1^{k_1}$, $a_2 = b_2^{k_2}$, where $k_1$, $k_2$ are from Exercise 3.21. Then $\operatorname{ord}(a_1) = n_1$
and $\operatorname{ord}(a_2) = n_2$ are coprime. Let us check that $\operatorname{ord}(a_1 a_2) = n_1 n_2$, as required. If
$(a_1 a_2)^c = 1$, then $a_1^c = a_2^{n_2 - c}$ and $a_2^{n_1(n_2 - c)} = a_1^{n_1 c} = 1$. Therefore, $n_1 (n_2 - c)$ is
divisible by $n_2$. Hence, $c$ is divisible by $n_2$. By symmetry, $c$ is divisible by $n_1$ as well.
Since $n_1$ and $n_2$ are coprime, $c$ is divisible by $n_1 n_2$. □
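
Theorem 3.2 can be illustrated on the simplest example $A = \mathbb{F}_p^*$, the nonzero residues modulo a prime $p$. The sketch below computes orders by brute force and lists the generators. The names `order` and `generators` are our own, and $p$ is assumed to be prime (the code does not check this).

```python
def order(b, p):
    """Order of b in the multiplicative group of nonzero residues modulo a prime p."""
    m, power = 1, b % p
    while power != 1:
        power = power * b % p
        m += 1
    return m

def generators(p):
    """All b in {1, ..., p-1} whose powers exhaust the whole group."""
    return [b for b in range(1, p) if order(b, p) == p - 1]

p = 13
orders = [order(b, p) for b in range(1, p)]
print(generators(p))
print(all(max(orders) % m == 0 for m in orders))   # every order divides the maximal one
```

Brute force is adequate for small $p$ only; the point is to see that the maximal order equals $|A|$ and that every other order divides it, as in the proof.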


3.6.2 Description of All Finite Fields 


For an irreducible polynomial f € F,[f] of degree n, the residue field F,[¢]/(/) 
consists of p” elements of the form aj + ay +--+ + a,_\0"!, where a; € F, and 
F(8) =0. 

For example, the polynomial #? + 1 € F3[f] is irreducible, because it has no 
roots in F3. Therefore, the residue ring Fo & F3/ (t? + 1) is a field of nine elements 
a+ bi, where a,b € F3 = {—1,0,1} and i® [#4] satisfies i? = —1. The field 
extension F3 C Fo» is analogous to the extension R C C. For example, the Frobenius 
automorphism F3 : Fy > Fo, a+ a’, takes a + bi to a— bi and looks like complex 
conjugation. 

Similarly, the polynomial 7? ++ 1 € F2[f] is irreducible, because it has no roots 
in F). The field Fy “ F [¢/(#? + t+ 1) consists of four elements!*: 0,1, © [#], 
and 1+ @ = w* = w~!. The extension F, C Fy4 also becomes analogous to the 
extension R C C as soon we represent the field C as the extension of R by a root of 
the cyclotomic trinomial ®3(t) = #? + t+ 1,ie., as!? 


C=Ri/(? +t4+ 1) = {u+wo|u,weR, o = (-1+ivV3)/2€ Ch. 


'8Note that the equality —1 = 1 in F, allows us to avoid the minus sign. 
'°The standard Cartesian coordinates (x, y) in C are related to the “triangular coordinates” (u, w) 
of the same point x + iy = z=u+ wo byx =u—w/2,y= V3w/2. 
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2. 2 


Again, the Frobenius automorphism $F_2 : \mathbb{F}_4 \to \mathbb{F}_4$, $a \mapsto a^2$, maps $\omega \mapsto \overline{\omega} = \omega^2$
and is similar to complex conjugation: both act identically on the subfields being
extended and swap the roots of the trinomial $t^2 + t + 1$.


Exercise 3.22 Write down the multiplication tables and the tables of inverse
elements for the fields $\mathbb{F}_4$ and $\mathbb{F}_9$. List all squares, all cubes, and all generators
of the multiplicative groups $\mathbb{F}_4^*$ and $\mathbb{F}_9^*$.


Lemma 3.3 For finite fields $\Bbbk \subset \mathbb{F}$, there exists $n \in \mathbb{N}$ such that $|\mathbb{F}| = |\Bbbk|^n$.


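Proof We use induction^20 on the difference $|\mathbb{F}| - |\Bbbk|$. The case $|\mathbb{F}| = |\Bbbk|$ is trivial. If
there exists $\zeta \in \mathbb{F} \setminus \Bbbk$, then $\zeta$ is algebraic over $\Bbbk$, because $\zeta^k = \zeta^\ell$ for some $k \ne \ell$ in
the finite field $\mathbb{F}$. We know from Sect. 3.4.2 on p. 54 that there is a subfield $\Bbbk[\zeta] \subset \mathbb{F}$
isomorphic to $\Bbbk[x]/(\mu_\zeta)$, where $\mu_\zeta \in \Bbbk[x]$ is the minimal polynomial of $\zeta$ over $\Bbbk$.
Let $\deg \mu_\zeta = m$. Then $\Bbbk[\zeta]$ consists of $|\Bbbk|^m$ elements $a_0 + a_1\zeta + \cdots + a_{m-1}\zeta^{m-1}$,
$a_i \in \Bbbk$. By the inductive hypothesis applied to the extension $\Bbbk[\zeta] \subset \mathbb{F}$, there exists
$n \in \mathbb{N}$ such that $|\mathbb{F}| = |\Bbbk[\zeta]|^n = |\Bbbk|^{mn}$. $\square$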


Corollary 3.4 The cardinality of a finite field is equal to a power of its characteristic. $\square$


Theorem 3.3 For every $n \in \mathbb{N}$ and prime $p \in \mathbb{N}$, there exists a finite field $\mathbb{F}_q$ of
characteristic $p$ and cardinality $q = p^n$.


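Proof Consider the polynomial $f(x) = x^q - x \in \mathbb{F}_p[x]$. By Theorem 3.1, there exists
a field $\mathbb{F} \supset \mathbb{F}_p$ such that $f$ acquires $q = \deg f$ roots in $\mathbb{F}$, counted with multiplicities.
Since $f' = qx^{q-1} - 1 = -1$ has no roots, all these roots are distinct. In other words, there are
exactly $q$ distinct elements $\alpha \in \mathbb{F}$ such that $\alpha^q = \alpha$. They form a field, because $\alpha^q = \alpha$
implies $(-\alpha)^q = -\alpha$ and $(\alpha^{-1})^q = \alpha^{-1}$, and if $\beta = \beta^q$, then $\alpha\beta = \alpha^q\beta^q = (\alpha\beta)^q$ and
$\alpha + \beta = \alpha^{p^n} + \beta^{p^n} = F_p^n(\alpha) + F_p^n(\beta) = F_p^n(\alpha + \beta) = (\alpha + \beta)^{p^n}$, where
$F_p : \mathbb{F} \to \mathbb{F}$, $x \mapsto x^p$, is the Frobenius endomorphism. $\square$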


Theorem 3.4 Two finite fields are isomorphic if and only if they have equal 
cardinalities. 


Proof Let $\mathbb{F}$ be a field with $|\mathbb{F}| = q$ and $\operatorname{char} \mathbb{F} = p$. Then $q = p^n$ by Corollary 3.4.
It is enough to show that $\mathbb{F}$ is isomorphic to the field $\mathbb{F}_q$ constructed in the proof
of Theorem 3.3. The multiplicative group $\mathbb{F}^*$ is cyclic by Theorem 3.2 on p. 63.
Choose some generator $\zeta$ of $\mathbb{F}^*$ and write $\mu_\zeta \in \mathbb{F}_p[x]$ for the minimal polynomial
of $\zeta$ over $\mathbb{F}_p$. Thus, $\mathbb{F} = \mathbb{F}_p[\zeta]$ is isomorphic to $\mathbb{F}_p[x]/(\mu_\zeta)$, as we have seen in
Sect. 3.4.2 on p. 54. Since the polynomial $f(x) = x^q - x$ vanishes at $\zeta$, it is divisible
by $\mu_\zeta$ in $\mathbb{F}_p[x]$, i.e., $f = \mu_\zeta g$, where $\deg g < \deg f$. Since $f$ has $q$ distinct roots in $\mathbb{F}_q$,
the polynomial $\mu_\zeta$ should have at least one root $\xi$ in $\mathbb{F}_q$, since otherwise, $g$ would
have too many roots. As soon as $\mu_\zeta(\xi) = 0$, the assignment $[h] \mapsto h(\xi)$ gives a


^20 We take such a roundabout way because the vector-space machinery will not appear until Chap. 6.
Indeed, $\mathbb{F}$ is a finite-dimensional vector space over $\Bbbk \subset \mathbb{F}$, and Lemma 3.3 follows at once from
Corollary 6.3 on p. 133.
well-defined homomorphism $\mathbb{F}_p[x]/(\mu_\zeta) \to \mathbb{F}_q$. It maps $[1] \mapsto 1$ and therefore is
injective by Proposition 2.2 on p. 34. Since both fields consist of $q$ elements, it is
surjective as well. Thus, the fields $\mathbb{F}$, $\mathbb{F}_q$ are isomorphic to the same residue field
$\mathbb{F}_p[x]/(\mu_\zeta)$. $\square$


3.6.3 Quadratic Residues 


Fix a prime integer $p > 2$. The squares in the multiplicative group $\mathbb{F}_p^*$ are
called quadratic residues modulo $p$. All the other elements in $\mathbb{F}_p^*$ are called
quadratic nonresidues or just nonresidues for short. The quadratic residues form a
multiplicative subgroup in $\mathbb{F}_p^*$, the image of the multiplicative group homomorphism
$\mathbb{F}_p^* \to \mathbb{F}_p^*$, $x \mapsto x^2$. The kernel of this homomorphism consists of two elements,
because $x^2 = 1$ exactly for $x = \pm 1$ in the field $\mathbb{F}_p$. We conclude that there are
precisely $(p - 1)/2$ quadratic residues in $\mathbb{F}_p^*$.

Fermat’s little theorem^21 allows us to check whether a given residue $a \in \mathbb{F}_p^*$
is quadratic. Namely, since $a^{p-1} = 1$ for all $a \in \mathbb{F}_p^*$, the image of another
homomorphism of multiplicative groups,

$$\mathbb{F}_p^* \to \mathbb{F}_p^*, \quad x \mapsto x^{(p-1)/2}, \tag{3.21}$$

lies inside the roots of the polynomial $x^2 - 1$. The equation $x^{(p-1)/2} = 1$ has at most
$(p - 1)/2 < p - 1$ roots in $\mathbb{F}_p$. Thus, the homomorphism (3.21) maps $\mathbb{F}_p^*$ onto $\{\pm 1\}$
surjectively. Therefore, its kernel consists of $(p - 1)/2$ elements. Since all quadratic
residues lie in the kernel, we conclude that $a \in \mathbb{F}_p^*$ is a quadratic residue if and only
if $a^{(p-1)/2} = 1$. For example, $-1$ is a square in $\mathbb{F}_p$ if and only if $(p - 1)/2$ is even.
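
The count of quadratic residues and the criterion $a^{(p-1)/2} = 1$ can be checked directly for small primes. The following Python sketch is an illustration only; the function names are our own, and residues are represented by the integers $1, \ldots, p-1$.

```python
def quadratic_residues(p):
    """Set of nonzero squares modulo an odd prime p."""
    return {x * x % p for x in range(1, p)}

def euler_criterion(a, p):
    """True if a^((p-1)/2) = 1 in F_p, i.e. a lies in the kernel of (3.21)."""
    return pow(a, (p - 1) // 2, p) == 1

def check_prime(p):
    """Verify the three statements of this section for one odd prime p."""
    squares = quadratic_residues(p)
    count_ok = len(squares) == (p - 1) // 2
    criterion_ok = all((a in squares) == euler_criterion(a, p) for a in range(1, p))
    # -1 is represented by p - 1; it is a square exactly when (p - 1)/2 is even
    minus_one_ok = ((p - 1) in squares) == (((p - 1) // 2) % 2 == 0)
    return count_ok and criterion_ok and minus_one_ok
```

For instance, `check_prime(13)` confirms that there are six squares modulo 13 and that $-1$ is among them, while modulo 7 it is not.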

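For $n \in \mathbb{N}$ and prime $p > 2$, write $[n]_p \in \mathbb{F}_p$ for the residue class of $n$. The
quantity

$$\left(\frac{n}{p}\right) \stackrel{\mathrm{def}}{=} [n]_p^{\frac{p-1}{2}} =
\begin{cases}
1, & \text{if } [n]_p \in \mathbb{F}_p^* \text{ is a quadratic residue},\\
0, & \text{if } [n]_p = 0,\\
-1, & \text{if } [n]_p \in \mathbb{F}_p^* \text{ is a quadratic nonresidue},
\end{cases} \tag{3.22}$$

is called the Legendre-Jacobi symbol of $n$ modulo the prime $p$. It depends only on
$n \pmod p$ and is multiplicative in $n$ in the sense that

$$\left(\frac{mn}{p}\right) = \left(\frac{m}{p}\right)\cdot\left(\frac{n}{p}\right).$$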


^21 See Theorem 2.1 on p. 30.

Exercise 3.23*. Prove that for every prime $p > 2$, one has $\left(\frac{2}{p}\right) = 1$ if and only if
$p \equiv \pm 1 \pmod 8$.

The Legendre-Jacobi symbol can be computed easily thanks to the following
quadratic reciprocity law, discovered by Euler and first proved by Gauss:

$$\left(\frac{p}{q}\right)\cdot\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}} \quad \text{for all prime integers } p, q > 2. \tag{3.23}$$


Two proofs of this theorem found by Eisenstein and Zolotarev are sketched
in Problem 3.38 on p. 71 and in Problem 9.7 on p. 225. Here is an example of how
it works:

$$\left(\frac{57}{179}\right) = \left(\frac{179}{57}\right) = \left(\frac{8}{57}\right) = \left(\frac{2}{57}\right)^3 = 1.$$

Thus, 57 is a square modulo 179.
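
Since $57 = 3 \cdot 19$ is not prime, the intermediate symbols in this computation have to be read as symbols with an odd composite denominator, for which the same reciprocity rule and the rule of Exercise 3.23 are assumed to hold. Under this assumption, the computation can be mechanized as in the following Python sketch; the function names `jacobi` and `legendre_by_euler` are our own, and the second function is the independent check via the criterion $a^{(p-1)/2} = \pm 1$ from above.

```python
def jacobi(n, m):
    """Symbol (n/m) for odd m > 0, computed by reciprocity and the rule for 2."""
    n %= m
    result = 1
    while n != 0:
        while n % 2 == 0:            # pull out factors of 2, cf. Exercise 3.23
            n //= 2
            if m % 8 in (3, 5):
                result = -result
        n, m = m, n                  # reciprocity step, cf. (3.23)
        if n % 4 == 3 and m % 4 == 3:
            result = -result
        n %= m
    return result if m == 1 else 0

def legendre_by_euler(a, p):
    """Symbol (a/p) for an odd prime p, read off from a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    return r if r <= 1 else -1

# all x with x^2 = 57 in F_179, found by exhaustive search
square_roots = [x for x in range(179) if x * x % 179 == 57]
```

Both functions return 1 for the pair (57, 179), and the exhaustive search finds exactly two square roots of 57 modulo 179.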


Problems for Independent Solution to Chap. 3 


Problem 3.1 Show that $x^n - 1$ divides $x^m - 1$ for $n \mid m$.

Problem 3.2 In the ring $\mathbb{Z}[x]$, find the residues on division of $x^{179} + x^{57} + x^2 + 1$ by
(a) $x^2 - 1$, (b) $x^2 + 1$, (c) $x^2 + x + 1$.

Problem 3.3 Find all multiple complex roots of the polynomial

$$x^7 + 7x^5 - 36x^4 + 15x^3 - 216x^2 + 9x - 324.$$


Problem 3.4 Check whether $\mathbb{R}[x]/(f)$ is a field for (a) $f = x^4 + 1$, (b) $f = x^3 + 1$,
(c) $f = x^2 + 3$.

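Problem 3.5 Given $\vartheta \in \mathbb{C}$, write $\mathbb{Q}[\vartheta] \subset \mathbb{C}$ for the smallest subfield containing $\vartheta$.
Are there any isomorphic fields among $\mathbb{Q}[\sqrt{2}]$, $\mathbb{Q}[\sqrt{3}]$, and $\mathbb{Q}[\sqrt[3]{2}]$?

Problem 3.6 Find the minimal polynomial^22 of (a) $2 - 3i \in \mathbb{C}$ over $\mathbb{R}$, (b) $\sqrt{2} + \sqrt{3} \in \mathbb{R}$
over $\mathbb{Q}$.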

Problem 3.7 For the finite field $\mathbb{F}_q$, show that every function $\mathbb{F}_q \to \mathbb{F}_q$ can be written
as $a \mapsto f(a)$ for an appropriate $f \in \mathbb{F}_q[x]$ and give an example of two different
polynomials producing the same functions $\mathbb{F}_q \to \mathbb{F}_q$.


^22 See Sect. 3.4.2 on p. 54.




Problem 3.8 Find a monic polynomial of the lowest possible degree with coefficients
in $\mathbb{Z}/(n)$ that produces the zero function $\mathbb{Z}/(n) \to \mathbb{Z}/(n)$ for (a) $n = 101$,
(b) $n = 111$, (c) $n = 121$.

Problem 3.9 Show that for every field^23 $\Bbbk$, the polynomial ring $\Bbbk[x]$ contains
(a) infinitely many irreducible polynomials,
(b) an irreducible polynomial of every degree.

Problem 3.10 List all irreducible polynomials of degree $\le 5$ in $\mathbb{F}_2[x]$ and all
irreducible polynomials of degree $\le 3$ in $\mathbb{F}_3[x]$.

Problem 3.11 How many irreducible polynomials of degrees 3 and 4 are there in
$\mathbb{F}_3[x]$?

Problem 3.12 Use an appropriate modification of Möbius inversion^24 to prove that
$\mathbb{F}_p[x]$ has exactly $\frac{1}{n}\sum_{d \mid n} p^d \mu(n/d)$ irreducible polynomials of degree $n$.


Problem 3.13 Write $\mathbb{F}_q$ for a finite field of $q = p^n$ elements and $\mathbb{F}_p \subset \mathbb{F}_q$ for
its prime subfield. Show that the order of every element in the multiplicative
group $\mathbb{F}_q^*$ divides $q - 1$. Use Problem 3.1 to show that for each $d \mid (q - 1)$, there
are exactly $d$ distinct elements $x \in \mathbb{F}_q^*$ satisfying the equation $x^d = 1$. Then
use an appropriate modification of the Möbius inversion formula^25 to compute
the number of elements of order $d$ in $\mathbb{F}_q^*$. In particular, find the total number of
elements of order $(q - 1)$ and deduce from this^26 that the multiplicative group $\mathbb{F}_q^*$
is cyclic. What can the degree of the minimal polynomial over $\mathbb{F}_p$ for an element of order
$(q - 1)$ in $\mathbb{F}_q^*$ be?

Problem 3.14 Show that for a field $\Bbbk$ of characteristic $p > 0$ and $a \in \Bbbk$, the
polynomial $x^p - a$ either is irreducible in $\Bbbk[x]$ or has a root of multiplicity $p$ in $\Bbbk$.


Problem 3.15 Let the polynomial $f(x) = x^p - x - a \in \mathbb{F}_p[x]$ have a root $\zeta$ in
some field $\mathbb{F} \supset \mathbb{F}_p$. Find another $p - 1$ roots of $f$ in $\mathbb{F}$ and prove that in $\mathbb{F}_p[x]$,
the polynomial $f$ is either irreducible or completely factorizable as a product of
linear binomials.


Problem 3.16 Show that every polynomial $f \in \mathbb{R}[x]$ is a product of linear binomials
and quadratic trinomials with negative discriminants. Write such a factorization
for $f = x^8 + 128$.


Problem 3.17 (Viète’s Formulas) Given a monic polynomial

$$f(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = (x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_n),$$

express its coefficients $a_k$ in terms of the roots $\alpha_\nu$ and determine a constant $\vartheta =
\vartheta(a_1, a_2, \ldots, a_n)$ such that $f(t - \vartheta)$ has no term of degree $n - 1$.


^23 Especially for a finite one.
^24 See Problem 2.20 on p. 39.
^25 See Problem 2.20 on p. 39.
^26 Independently of Theorem 3.2 on p. 63.
Problem 3.18 (Discriminant) In the notation of Problem 3.17, the quantity

$$D_f \stackrel{\mathrm{def}}{=} \prod_{i<j} (\alpha_i - \alpha_j)^2$$

is called the discriminant of the monic polynomial $f(x) = \prod (x - \alpha_i)$. Express the
discriminants of the trinomials (a) $x^2 + px + q$, (b) $x^3 + px + q$, as polynomials
in $p$, $q$.

Problem 3.19 Prove that the cubic trinomial $f(x) = x^3 + px + q \in \mathbb{R}[x]$ has three
distinct real roots if and only if its discriminant $D_f$ is positive. Show that in this
case, there exists $\lambda \in \mathbb{R}$ such that the substitution $x = \lambda t$ reduces the equation
$f(x) = 0$ to the form $4t^3 - 3t = a$, where $a \in \mathbb{R}$ has $|a| < 1$. Use the expression
for $\cos(3\varphi)$ in terms of $\cos\varphi$ from Example 3.5 on p. 59 to solve $4t^3 - 3t = a$ in
trigonometric functions of $a$.


Problem 3.20 Solve the cubic equations (a) $x^3 - 3x + 1 = 0$, (b) $x^3 + x^2 - 2x - 1 = 0$,
in trigonometric functions.


Problem 3.21 Find the real and imaginary parts, modulus, and argument of the fol- 
lowing complex numbers and depict them as accurately as you can in the complex 


plane: (a)(5 + (7 ~ 61) /(3 + A), (V(1 + 5/18, (VF +)/—9) 


Problem 3.22 Using only the four arithmetic operations and square roots of positive
real numbers, write explicit expressions for the real and imaginary parts of the
roots of the quadratic equation $z^2 = a$.


Problem 3.23 Find all complex solutions of the following equations: 

(a) J + Qi-7)z+ 3-1 =0, 

(b) 2 =3 

fo lage Tl 

(d) $(z + i)^n + (z - i)^n = 0$,

(e) Z=2'. 

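Problem 3.24 (Euler Product) Show that for every odd $m \in \mathbb{N}$, there exists
$f_m \in \mathbb{Q}[x]$ such that $\sin mx / \sin x = f_m(\sin^2 x)$. Find the degree, roots, and leading
coefficient of $f_m$. Show that

(a) $\dfrac{\sin mx}{\sin x} = (-4)^{\frac{m-1}{2}} \prod_{j=1}^{\frac{m-1}{2}} \left(\sin^2 x - \sin^2\dfrac{2\pi j}{m}\right)$,
(b) $(-1)^{\frac{m-1}{2}} \sin(mx) = 2^{m-1} \prod_{j=0}^{m-1} \sin\left(x + \dfrac{2\pi j}{m}\right)$.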


Problem 3.25 For all s,n € N, evaluate the sum and product of the sth powers of 
all nth roots of unity in C. 
Problem 3.26 Evaluate the following sums:


(a) sinx + sin2x +---+ sinnx, 

(b) cosx + 2cos2x +---+ncosnx, 

© (J+(M+G@t--. 

@ ()+()+G)+-- 

©) ()-3Q+5G)+-- 

Problem 3.27 Show that three distinct points $z_1, z_2, z_3$ in the complex plane are
collinear if and only if their simple ratio $(z_1 - z_3)/(z_2 - z_3)$ is real.


Problem 3.28 Show that a quadruple of distinct points $z_1, z_2, z_3, z_4 \in \mathbb{C}$ lies either
on a line or on a circle if and only if their cross ratio $\bigl((z_1 - z_3)(z_2 - z_4)\bigr) :
\bigl((z_1 - z_4)(z_2 - z_3)\bigr)$ is real.


Fig. 3.4 The complex cat 


Problem 3.29 For the maps $\mathbb{C}^* \to \mathbb{C}^*$ defined by $z \mapsto z^2$ and $z \mapsto z^{-1}$, draw in the
complex plane the images of (a) the line $x + y = 2$, (b) the Cartesian and polar
grids, (c) the circle $|z + i| = 1$, (d) the cat in Fig. 3.4.


Problem 3.30 Write $\zeta \in \mathbb{C}$ for a primitive $k$th root of unity. Show that

(a) $\prod_{\nu=0}^{k-1} (\zeta^\nu x - a) = (-1)^{k+1}(x^k - a^k)$ for all $a \in \mathbb{C}$,

(b) for every $f \in \mathbb{C}[x]$, there exists $h \in \mathbb{C}[x]$ such that $\prod_{\nu=0}^{k-1} f(\zeta^\nu x) = h(x^k)$ and the
roots of $h$ are exactly the $k$th powers of the roots of $f$.


Problem 3.31 Find a polynomial $f \in \mathbb{C}[x]$ whose complex roots are exactly

(a) the squares of the complex roots of $x^4 + 2x^3 - x + 3$,
(b) the cubes of the complex roots of $x^4 - x - 1$.
Problem 3.32 (Cyclotomic Polynomials) For the cyclotomic polynomials^27
$\Phi_n(x) = \prod_\zeta (x - \zeta)$, where $\zeta \in \mathbb{C}$ runs through the primitive $n$th roots of
unity, show that

(a) $\Phi_{2n}(x) = \Phi_n(-x)$ for odd $n$,
(b) $x^n - 1 = \prod_{d \mid n} \Phi_d(x)$,
(c) $\Phi_n(x) = \prod_{d \mid n} (x^{n/d} - 1)^{\mu(d)}$,
(d) $\Phi_p(x) = x^{p-1} + \cdots + x + 1$ and $\Phi_{p^k}(x) = \Phi_p\bigl(x^{p^{k-1}}\bigr)$ for prime $p$,
(e) $\Phi_{pm}(x) = \Phi_m(x^p)/\Phi_m(x)$ for all $m \in \mathbb{N}$ and primes $p \nmid m$,
(f) $\Phi_{p_1^{k_1} \cdots p_n^{k_n}}(x) = \Phi_{p_1 p_2 \cdots p_n}\bigl(x^{p_1^{k_1-1} \cdots p_n^{k_n-1}}\bigr)$ for distinct primes $p_i$.

Problem 3.33 (Topology of the Complex Plane) An open disk of radius $\varepsilon$ centered
at a point $z \in \mathbb{C}$ is called an $\varepsilon$-neighborhood of $z$ in the complex plane $\mathbb{C}$. The
basic notions of calculus, such as the limit of a sequence of complex numbers and
the limit of a function, the continuity of a function, and open and closed subsets in
$\mathbb{C}$, are defined in terms of $\varepsilon$-neighborhoods in precisely the same way as those^28
for $\mathbb{R}$. Formulate and prove theorems about the limits of sums and products for
convergent sequences of complex numbers. Show that $\lim_{n\to\infty}(x_n + iy_n) = a + ib$
in $\mathbb{C}$ if and only if there exist both $\lim_{n\to\infty} x_n = a$ and $\lim_{n\to\infty} y_n = b$ in $\mathbb{R}$. Show
that every bounded sequence of complex numbers has a convergent subsequence.
Prove that for every continuous function $f : \mathbb{C} \to \mathbb{R}$ and closed, bounded subset
$Z \subset \mathbb{C}$, the restriction $f|_Z : Z \to \mathbb{R}$ is bounded and achieves its maximal and
minimal values at points of $Z$.


Problem 3.34 (Algebraic Closure of $\mathbb{C}$) Let $f \in \mathbb{C}[x]$ be a polynomial of positive
degree. Prove that the function $|f| : \mathbb{C} \to \mathbb{R}$, $z \mapsto |f(z)|$, is continuous and
bounded from below,^29 and that it achieves its minimal value on $\mathbb{C}$ at some point
$z_0 \in \mathbb{C}$. Then assume that $f(z_0) \ne 0$ and expand $f$ as a polynomial in $w = z - z_0$:
$f(z) = f(z_0) + a_mw^m + \text{higher powers of } w$, where $a_mw^m$ is the nonzero term
of least positive degree. Choose some $\vartheta = \sqrt[m]{-f(z_0)/a_m} \in \mathbb{C}$ in order to have
$a_m\vartheta^m = -f(z_0)$. Check that $|f(z_0 + t\vartheta)| < |f(z_0)|$ for all sufficiently small real $t > 0$.
Deduce from these observations that $f$ has a root in $\mathbb{C}$.


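Problem 3.35 Find all invertible elements in the following rings:

(a) $\mathbb{Z}[i] \stackrel{\mathrm{def}}{=} \{a + bi \in \mathbb{C} \mid a, b \in \mathbb{Z},\ i^2 = -1\}$,
(b) $\mathbb{Z}[\omega] \stackrel{\mathrm{def}}{=} \{a + b\omega \in \mathbb{C} \mid a, b \in \mathbb{Z},\ \omega^2 + \omega + 1 = 0\}$.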


^27 See Sect. 3.5.4 on p. 60.

^28 For example, a point $p \in \mathbb{C}$ is the limit of a sequence if every $\varepsilon$-neighborhood of $p$ contains
all but a finite number of elements of the sequence. A set $U$ is open if for every $z \in U$, some
$\varepsilon$-neighborhood of $z$ is contained in $U$. A function $f : \mathbb{C} \to \mathbb{C}$ (or function $g : \mathbb{C} \to \mathbb{R}$) is continuous
if the preimages of all open sets are open.

^29 Hint: First show that $\forall M \in \mathbb{R}$, $\exists R > 0$: $|f(z)| > M$ as soon as $|z| > R$.
Problem 3.36 In the ring $\mathbb{Z}[i]$, write $5 \in \mathbb{Z}[i]$ as a product of two irreducible
factors.^30


Problem 3.37 Prove that the following properties of a prime number $p \in \mathbb{N}$ are
equivalent:
(a) $p \not\equiv 3 \pmod 4$, (b) $-1$ is a square in $\mathbb{F}_p$, (c) $p$ is reducible in $\mathbb{Z}[i]$,
(d) there exists a nonzero ring homomorphism $\mathbb{Z}[i] \to \mathbb{F}_p$,
(e) $p = x^2 + y^2$ for some $x, y \in \mathbb{Z}$.
Problem 3.38 (Eisenstein’s Proof of Quadratic Reciprocity) Show that for every
prime $p \in \mathbb{N}$, the Legendre-Jacobi symbol^31 $\left(\frac{n}{p}\right)$ is a multiplicative character^32
of $n$ and evaluate $\sum_{n=1}^{p-1} \left(\frac{n}{p}\right)$. Then compare the sign of $\left(\frac{m}{p}\right)$ with the sign of the
product

$$\prod_{j=1}^{(p-1)/2} \frac{\sin(2\pi mj/p)}{\sin(2\pi j/p)}.$$

Then take $m = q$ in this product, factorize each fraction as in Problem 3.24 on
p. 68, and prove that for every prime $q \in \mathbb{N}$,


4 
Problem 3.39 Evaluate fica : 
109 


^30 Recall that a noninvertible element $a$ in a commutative ring with unit is reducible if $a = bc$ for
some noninvertible $b$, $c$; otherwise, $a$ is irreducible.

^31 See formula (3.22) on p. 65.
^32 See Problem 2.18 on p. 39.


Chapter 4 
Elementary Functions and Power Series 
Expansions 


In this chapter as in Chap. 3, we write $K$ for an arbitrary commutative ring with unit
and $\Bbbk$ for an arbitrary field.


4.1 Rings of Fractions 


4.1.1 Localization 


The concept of a fraction,^1 which creates a field $\mathbb{Q}$ from the ring $\mathbb{Z}$, is applicable
in great generality. In this section, we formalize this notion for an arbitrary
commutative ring $K$ with unit.

A subset $S \subset K$ is called multiplicative if $1 \in S$, $0 \notin S$, and $st \in S$ for all $s, t \in S$.
For example, if $q \in K$ is not nilpotent, then the set of all its nonnegative integer
powers^2 $q^k$ is clearly multiplicative. Another important example of a multiplicative
subset is provided by the set of all elements in $K$ that do not divide zero:

$$K^\circ \stackrel{\mathrm{def}}{=} \{a \in K \mid ab = 0 \Rightarrow b = 0\}.$$

In particular, the nonzero elements of an integral domain form a multiplicative set.

Given a multiplicative subset $S \subset K$, write $\sim_S$ for the equivalence on $K \times S$
generated by all relations $(a, t) \sim (as, ts)$, where $s \in S$. The equivalence class of
a pair $(a, s)$ modulo $\sim_S$ is called a fraction with denominator in $S$ and is denoted
by $a/s$. We write $KS^{-1}$ for the set of all such fractions and call it the localization
of $K$ in $S$ or the ring of fractions with numerators in $K$ and denominators in $S$.
The appropriateness of the term “ring” is provided by Lemma 4.2 below.


^1 See Example 1.5 on p. 9 and Example 2.2 on p. 20.

^2 By definition, we put $q^0 \stackrel{\mathrm{def}}{=} 1$.
Lemma 4.1 The equality $a/r = b/t$ in $KS^{-1}$ holds if and only if there exists $s \in S$
such that $ats = brs$ in $K$.


Proof Let us provisionally write $(a, r) \approx (b, t)$ if $ats = brs$ for some $s \in S$. This
relation is contained in $\sim_S$, because it can be achieved in two steps by means of
the relations generating $\sim_S$ as follows: $(a, r) \sim (ats, rts) = (brs, rts) \sim (b, t)$.
Conversely, every generating relation $(a, t) \sim (as, ts)$ holds for $\approx$, because $a \cdot ts = as \cdot t$.
Therefore, it remains to verify that $\approx$ is an equivalence relation. Reflexivity and
symmetry are evident. Let us check transitivity. If $(a, r) \approx (b, t)$ and $(b, t) \approx (c, u)$,
i.e., there are $s_1, s_2 \in S$ such that $ats_1 = brs_1$ and $bus_2 = cts_2$, then $au(ts_1s_2) =
brus_1s_2 = cr(ts_1s_2)$, that is, $(a, r) \approx (c, u)$. $\square$


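Lemma 4.2 Addition and multiplication of fractions are well defined by the rules

$$\frac{a}{r} + \frac{b}{s} \stackrel{\mathrm{def}}{=} \frac{as + br}{rs}, \qquad \frac{a}{r} \cdot \frac{b}{s} \stackrel{\mathrm{def}}{=} \frac{ab}{rs}, \tag{4.1}$$

and they provide $KS^{-1}$ with the structure of a commutative ring with unit element
$1/1$ and zero element $0/1$.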


Proof The consistency of the definitions (4.1) means that the results of the
operations are unchanged after replacement of $\frac{a}{r}$ and $\frac{b}{s}$ by $\frac{au}{ru}$ and $\frac{bw}{sw}$ respectively,
where $u, w \in S$. This is obvious:

$$\frac{au}{ru} + \frac{bw}{sw} = \frac{ausw + bwru}{rusw} = \frac{(as + br) \cdot wu}{rs \cdot wu} = \frac{as + br}{rs},$$
$$\frac{au}{ru} \cdot \frac{bw}{sw} = \frac{aubw}{rusw} = \frac{(ab) \cdot wu}{rs \cdot wu} = \frac{ab}{rs}.$$

The axioms of a commutative ring with unit are checked in the same straightforward
way, and we leave this task to the reader. $\square$


Theorem 4.1 The map $\iota_S : K \to KS^{-1}$, $a \mapsto a/1$, is a ring homomorphism with
kernel $\ker \iota_S = \{a \in K \mid \exists\, s \in S : as = 0\}$. Every element of $\iota_S(S)$ is invertible
in $KS^{-1}$. For every ring homomorphism $\varphi : K \to R$ such that $\varphi(1) = 1$ and all
elements of $\varphi(S)$ are invertible in $R$, there exists a unique ring homomorphism $\varphi_S :
KS^{-1} \to R$ such that $\varphi = \varphi_S \circ \iota_S$.


Proof It is clear that $\iota_S$ respects both ring operations. Given $s \in S$, the inverse to
$\iota_S(s) = s/1$ is $1/s$. The fraction $\iota_S(a) = a/1$ equals $0/1$ if and only if there exists
$s \in S$ such that $a \cdot 1 \cdot s = 0 \cdot 1 \cdot s = 0$. It remains to prove the last statement. There
is just one way to extend $\varphi : K \to R$ to a ring homomorphism $\varphi_S : KS^{-1} \to R$,
because the equalities $\varphi_S(1/s) \cdot \varphi(s) = \varphi_S(s \cdot (1/s)) = \varphi_S(1) = 1$ force us to put
$\varphi_S(1/s) = 1/\varphi(s)$. Therefore, the required extension should be defined by

$$\varphi_S(a/r) \stackrel{\mathrm{def}}{=} \varphi(a) \cdot \frac{1}{\varphi(r)}.$$

This definition is consistent, because the replacement of $\frac{a}{r}$ by $\frac{as}{rs}$ for $s \in S$ leads to

$$\varphi_S\left(\frac{as}{rs}\right) = \varphi(as) \cdot \frac{1}{\varphi(rs)} = \frac{\varphi(a)\varphi(s)}{\varphi(r)\varphi(s)} = \frac{\varphi(a)}{\varphi(r)}.$$

We ask the reader to check that $\varphi_S$ respects addition and multiplication. $\square$
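
A small finite example may help to see Lemma 4.1 and the kernel formula of Theorem 4.1 at work. The Python sketch below is our own illustration, not part of the text: it takes $K = \mathbb{Z}/(6)$ and the multiplicative set $S = \{1, 2, 4\}$ of powers of 2, and groups all pairs $(a, r) \in K \times S$ into classes of equal fractions using the criterion $ats = brs$.

```python
from itertools import product

N = 6
S = [1, 2, 4]   # powers of 2 in Z/(6): contains 1, omits 0, closed under multiplication

def equal_fractions(a, r, b, t):
    """Criterion of Lemma 4.1: a/r == b/t iff ats == brs in Z/(N) for some s in S."""
    return any((a * t * s - b * r * s) % N == 0 for s in S)

# Group all pairs (a, r) into classes of equal fractions; comparing with one
# representative per class suffices because the relation is transitive.
classes = []
for a, r in product(range(N), S):
    for cls in classes:
        b, t = cls[0]
        if equal_fractions(a, r, b, t):
            cls.append((a, r))
            break
    else:
        classes.append([(a, r)])

# kernel of a -> a/1: all a with a/1 == 0/1
kernel = [a for a in range(N) if equal_fractions(a, 1, 0, 1)]
```

In this example the 18 pairs fall into 3 classes, and the kernel is $\{0, 3\}$: inverting 2 kills the element 3, since $3 \cdot 2 = 0$ in $\mathbb{Z}/(6)$.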


Remark 4.1 (Universal Property of Localization) The last statement in Theorem 4.1
is known as the universal property of localization. The ring $KS^{-1}$ together
with the homomorphism $\iota_S : K \to KS^{-1}$ is uniquely determined by this property
in the following sense. Let a homomorphism $\iota' : K \to F$ respect the unit element
and take all elements of $S$ to invertible elements in $F$. If for every ring homomorphism
$\varphi : K \to R$ such that $\varphi(1) = 1$ and all elements of $\varphi(S)$ are invertible in $R$ there
exists a unique ring homomorphism $\varphi' : F \to R$ such that $\varphi = \varphi' \circ \iota'$, then there
exists a unique isomorphism of rings $\psi : KS^{-1} \to F$ such that $\iota' = \psi \circ \iota_S$. Indeed, by
Theorem 4.1, the homomorphism $\iota'$ is uniquely factorized as $\iota' = \psi \circ \iota_S$. As soon
as $\iota'$ also possesses the universal property, the homomorphism $\iota_S$ is also uniquely
factorized as $\iota_S = \psi' \circ \iota'$. The composition $\psi' \circ \psi$ provides $\iota_S$ itself with the
factorization $\iota_S = \psi' \circ \psi \circ \iota_S$. Since $\iota_S = \mathrm{Id}_{KS^{-1}} \circ \iota_S$ as well, the uniqueness of
such a factorization forces $\psi' \circ \psi = \mathrm{Id}_{KS^{-1}}$. For the same reason, $\psi \circ \psi' = \mathrm{Id}_F$.
Thus $\psi'$ and $\psi$ are ring isomorphisms that are inverse to each other.


Remark 4.2 (Multiplicative Sets Containing Zero) If we remove the condition $0 \notin S$
from the definition of multiplicative set, then everything said before maintains
its formal sense. The equivalence $\sim_S$ and the ring $KS^{-1}$ remain well defined;
Lemma 4.1, Lemma 4.2, and Theorem 4.1 together with their proofs are still true as
well. However, if $0 \in S$, then $KS^{-1}$ becomes the zero ring, because of

$$a/s = (a \cdot 0)/(s \cdot 0) = 0/0 = (0 \cdot 1)/(0 \cdot 1) = 0/1$$

for every fraction $a/s$.


4.1.2 Field of Fractions of an Integral Domain 


If $K$ has no zero divisors, then all nonzero elements of $K$ form a multiplicative
system. The localization of $K$ in this system is a field. It is called the field of
fractions of $K$ and is denoted by $Q_K$. The homomorphism $\iota : K \to Q_K$, $a \mapsto a/1$, is
injective in this case. The universal property of localization says that for every ring
homomorphism $\varphi : K \to R$ such that $\varphi(1) = 1$ and such that for every $a \ne 0$, $\varphi(a)$
is invertible in $R$, there exists a unique injection $\tilde\varphi : Q_K \to R$ coinciding with $\varphi$ on
$K \subset Q_K$.




Example 4.1 (The Field $\mathbb{Q}$ Revisited) The field of fractions of $\mathbb{Z}$ is the field of
rational numbers $Q_\mathbb{Z} = \mathbb{Q}$. It is canonically embedded as the prime subfield into
any field of zero characteristic.^3


Example 4.2 (Laurent Series) For a field $\Bbbk$, the ring of formal power series $\Bbbk[[x]]$
is an integral domain. Its field of fractions $Q_{\Bbbk[[x]]}$ is described as follows. Every
nonzero power series $q(x) \in \Bbbk[[x]]$ can be uniquely written as $x^mq_{\mathrm{red}}(x)$, where $m$ equals
the degree of the lowest term in $q$, and $q_{\mathrm{red}} \in \Bbbk[[x]]$ has nonzero constant term. Since
every power series with nonzero constant term is invertible in $\Bbbk[[x]]$, every fraction
$f = p/q$ can be uniquely written as $f = x^{-m}p/q_{\mathrm{red}} = x^{-m}h$, where $h \in \Bbbk[[x]]$. We
conclude that the field of fractions $Q_{\Bbbk[[x]]}$ coincides with the localization of $\Bbbk[[x]]$ in
the multiplicative system of powers $x^m$. The latter consists of formal power series
with integer exponents bounded from below:

$$f(x) = \sum_{k \ge -m} a_kx^k = a_{-m}x^{-m} + \cdots + a_{-1}x^{-1} + a_0 + a_1x + a_2x^2 + \cdots. \tag{4.2}$$

It is denoted by $\Bbbk((x))$ and is called the field of Laurent series. Thus, $Q_{\Bbbk[[x]]} \simeq \Bbbk((x))$.


4.2 Field of Rational Functions 


4.2.1 Simplified Fractions 


The field of fractions of the polynomial ring $\Bbbk[x]$ is denoted by $\Bbbk(x)$ and is called
the field of rational functions in one variable. The elements of $\Bbbk(x)$ are the fractions

$$f(x) = p(x)/q(x), \quad \text{where } p, q \in \Bbbk[x],\ q \ne 0. \tag{4.3}$$

Every fraction admits many different representations (4.3) as a ratio of two
polynomials. Among these, there is a minimal one, called simplified, which has
a monic denominator that is coprime to the numerator. It is obtained from (4.3) by
cancellation of $\mathrm{GCD}(p, q)$ and the leading coefficient of $q$.


Exercise 4.1 Show that two fractions are equal in $\Bbbk(x)$ if and only if their simplified
representations coincide.


Proposition 4.1 Assume that the denominator of the simplified representation $f/g$
is factorized as $g = g_1 \cdot g_2 \cdots g_m$, where $\mathrm{GCD}(g_i, g_j) = 1$ for all $i \ne j$ and all $g_i$ are
monic. Then the fraction $f/g$ is uniquely expanded in $\Bbbk(x)$ as a sum of simplified

^3 See Sect. 2.8 on p. 35.




fractions

$$\frac{f}{g} = h + \frac{f_1}{g_1} + \frac{f_2}{g_2} + \cdots + \frac{f_m}{g_m}, \tag{4.4}$$

where $\deg h = \deg f - \deg g$ and $\deg f_i < \deg g_i$ for all $i$.


Proof Write $G_i = g/g_i$ for the product of all $g_\nu$ except $g_i$. Then (4.4) is
equivalent to

$$f = hg + f_1G_1 + f_2G_2 + \cdots + f_mG_m,$$

where $\deg(f_1G_1 + f_2G_2 + \cdots + f_mG_m) < \deg g$, because of the above assumptions on
the degrees. This means that $h$ is the quotient of the division of $f$ by $g$, $f_1G_1 + f_2G_2 +
\cdots + f_mG_m$ is the remainder of this division, and each $f_i$ is the unique polynomial of
degree $\deg f_i < \deg g_i$ that represents the residue class $[f] \cdot [G_i]^{-1}$ in $\Bbbk[x]/(g_i)$. Thus,
all ingredients of (4.4) exist and are uniquely determined by $f$ and $g_1, g_2, \ldots, g_m$. $\square$


Proposition 4.2 Every simplified fraction $f/g^m$, where $\deg f < \deg(g^m)$ and $g$ is
monic, is uniquely expanded in $\Bbbk(x)$ as a sum of simplified fractions

$$\frac{f}{g^m} = \frac{f_1}{g} + \frac{f_2}{g^2} + \cdots + \frac{f_m}{g^m}, \tag{4.5}$$

where $\deg f_i < \deg g$ for all $i$.


Proof The expansion (4.5) is equivalent to the expansion

$$f = f_1g^{m-1} + f_2g^{m-2} + \cdots + f_{m-1}g + f_m, \tag{4.6}$$

which is nothing but the representation of $f$ in $g$-adic notation, where $f_m$ is the
remainder on division of $f$ by $g$, $f_{m-1}$ is the remainder on division of the quotient
$(f - f_m)/g$ by $g$, $f_{m-2}$ is the remainder on division of the quotient $\bigl((f - f_m)/g - f_{m-1}\bigr)/g$
by $g$, etc. $\square$


4.2.2 Partial Fraction Expansion 


The two previous propositions imply that every simplified fraction $f/g \in \Bbbk(x)$ can be
uniquely written as a sum of a polynomial of degree $\deg f - \deg g$ and a number of
simplified fractions $p/q^m$, where $\deg p < \deg q$, $q$ runs through the monic irreducible
divisors of $g$, and $m \in \mathbb{N}$ varies between 1 and the multiplicity of $q$ in the irreducible
factorization of $g$. This sum is called the partial fraction expansion of $f/g$. It can be
helpful in practical computations.
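Example 4.3 Let us compute the antiderivative^4 and 2016th derivative of $1/(1 + x^2)$.
The partial fraction expansion of $1/(1 + x^2)$ in $\mathbb{C}(x)$ looks like

$$\frac{1}{1 + x^2} = \frac{\alpha}{1 + ix} + \frac{\beta}{1 - ix}, \quad \text{where } \alpha, \beta \in \mathbb{C}.$$

Substituting $x = \pm i$ in the equality $1 = \alpha(1 - ix) + \beta(1 + ix)$, we conclude that
$\alpha = \beta = 1/2$, i.e.,

$$\frac{1}{1 + x^2} = \frac{1}{2}\left(\frac{1}{1 + ix} + \frac{1}{1 - ix}\right).$$

Now we can easily compute the 2016th derivative. Since
$\left(\frac{d}{dx}\right)^n (1 \pm ix)^{-1} = n!\,(\mp i)^n (1 \pm ix)^{-n-1}$ and $i^{2016} = 1$, we get

$$\begin{aligned}
\left(\frac{d}{dx}\right)^{2016} \frac{1}{1 + x^2}
&= \frac{1}{2}\left(\frac{d}{dx}\right)^{2016} (1 + ix)^{-1} + \frac{1}{2}\left(\frac{d}{dx}\right)^{2016} (1 - ix)^{-1} \\
&= \frac{2016!}{2}\left(\frac{1}{(1 + ix)^{2017}} + \frac{1}{(1 - ix)^{2017}}\right) \\
&= \frac{2016!}{(1 + x^2)^{2017}} \cdot \frac{(1 - ix)^{2017} + (1 + ix)^{2017}}{2} \\
&= \frac{2016!}{(1 + x^2)^{2017}} \left(1 - \binom{2017}{2}x^2 + \binom{2017}{4}x^4 - \binom{2017}{6}x^6 + \cdots + \binom{2017}{2016}x^{2016}\right),
\end{aligned}$$

as well as the antiderivative,

$$\int \frac{dx}{1 + x^2} = \frac{1}{2}\int \frac{dx}{1 + ix} + \frac{1}{2}\int \frac{dx}{1 - ix}
= \frac{1}{2i}\bigl(\log(1 + ix) - \log(1 - ix)\bigr) = x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots.$$

All these equalities can in fact be treated purely algebraically within the ring $\Bbbk[[x]]$.
In the next sections, we shall explain explicitly what this means.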


^4 It will be defined explicitly in Sect. 4.3 on p. 82.
4.2.3 Power Series Expansions of Rational Functions


By the universal property of localization, the tautological inclusion $\Bbbk[x] \subset \Bbbk((x))$
can be uniquely extended to the inclusion of fields

$$\Bbbk(x) \hookrightarrow \Bbbk((x)),$$

which is the identity on the subring of polynomials. From a practical viewpoint, this
means that each rational function $f/g$ can be uniquely expanded as a Laurent power
series. Such an expansion can be made quite explicit as soon as the denominator $g$
is completely factorized into linear binomials.^5 Let $\deg f < \deg g$ and

$$g(x) = 1 + a_1x + a_2x^2 + \cdots + a_nx^n = \prod (1 - \alpha_ix)^{m_i}, \tag{4.7}$$

where all $\alpha_i \in \Bbbk$ are distinct and $a_n \ne 0$.


Exercise 4.2 Check that the numbers $\alpha_i$ on the right-hand side of (4.7) are roots of
the polynomial

$$\chi(t) \stackrel{\mathrm{def}}{=} t^ng(1/t) = t^n + a_1t^{n-1} + \cdots + a_{n-1}t + a_n = \prod (t - \alpha_i)^{m_i}. \tag{4.8}$$


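Then the partial fraction expansion of $f/g$ consists of fractions

$$\frac{\beta_{ij}}{(1 - \alpha_ix)^{k_{ij}}}, \tag{4.9}$$

where $\beta_{ij} \in \Bbbk$ and $1 \le k_{ij} \le m_i$ for each $i$. When all the multiplicities $m_i$ are equal
to 1, the partial fraction expansion takes the especially simple form

$$\frac{f(x)}{(1 - \alpha_1x)(1 - \alpha_2x) \cdots (1 - \alpha_nx)} = \frac{\beta_1}{1 - \alpha_1x} + \frac{\beta_2}{1 - \alpha_2x} + \cdots + \frac{\beta_n}{1 - \alpha_nx}. \tag{4.10}$$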

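The constants $\beta_i$ can be found via multiplication of both sides by the common
denominator and substituting $x = \alpha_i^{-1}$. This leads to

$$\beta_i = \frac{f(\alpha_i^{-1})}{\prod_{\nu \ne i} \bigl(1 - (\alpha_\nu/\alpha_i)\bigr)} = \frac{\alpha_i^{n-1} f(\alpha_i^{-1})}{\prod_{\nu \ne i} (\alpha_i - \alpha_\nu)}. \tag{4.11}$$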
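
Formula (4.11) is easy to test on a concrete fraction. The following Python sketch is an illustration under our own choices: it takes $f = 1 + x$ and $g = (1 - x)(1 - 2x)(1 - 3x) = 1 - 6x + 11x^2 - 6x^3$, stores polynomials as coefficient lists with the lowest degree first, and compares the power series coefficients of $f/g$, computed by long division, with the sums $\beta_1\alpha_1^k + \beta_2\alpha_2^k + \beta_3\alpha_3^k$ that come from expanding each partial fraction as a geometric progression. The function names are hypothetical.

```python
from fractions import Fraction

def series_of_quotient(f, g, terms):
    """First `terms` power series coefficients of f/g, assuming g[0] == 1."""
    z = []
    for k in range(terms):
        fk = f[k] if k < len(f) else 0
        # coefficient of x^k in z * g must equal f_k
        z.append(fk - sum(g[j] * z[k - j] for j in range(1, min(k, len(g) - 1) + 1)))
    return z

def simple_partial_fractions(f, alphas):
    """Constants beta_i of (4.10), computed by formula (4.11); alphas must be distinct and nonzero."""
    n = len(alphas)
    betas = []
    for i, a in enumerate(alphas):
        value = sum(c * Fraction(1, a) ** k for k, c in enumerate(f))   # f(1/alpha_i)
        denom = 1
        for j, b in enumerate(alphas):
            if j != i:
                denom *= a - b
        betas.append(Fraction(a) ** (n - 1) * value / denom)
    return betas

alphas = [1, 2, 3]
betas = simple_partial_fractions([1, 1], alphas)
z = series_of_quotient([1, 1], [1, -6, 11, -6], 8)
```

With these inputs the constants are $\beta = (1, -6, 6)$, and the first coefficients of the series are $1, 7, 31, \ldots$, in agreement with $1 - 6 \cdot 2^k + 6 \cdot 3^k$.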


^5 Such a factorization is always possible (at least in theory) over an algebraically closed field $\Bbbk$ (see
Sect. 3.4.3 on p. 55).

Exercise 4.3 Check that $1/(1 - \alpha x) = 1 + \alpha x + \alpha^2x^2 + \cdots = \sum_{k \ge 0} \alpha^kx^k$ in $\Bbbk[[x]]$.


Therefore, if all $m_i$ are equal to 1, then the fraction $f/g$ is the sum of geometric
progressions (4.10):

$$f(x)/g(x) = \sum_{k \ge 0} \bigl(\beta_1\alpha_1^k + \beta_2\alpha_2^k + \cdots + \beta_n\alpha_n^k\bigr)\, x^k,$$

where the $\beta_i$ are provided by formula (4.11). If there are some $\beta/(1 - \alpha x)^m$ with
$m > 1$ among the partial fractions (4.9), then they are expanded in power series by
Newton’s binomial formula with negative integer exponent:

$$\frac{1}{(1 - x)^m} = \sum_{k \ge 0} \frac{(k + m - 1)(k + m - 2) \cdots (k + 1)}{(m - 1)!}\, x^k = \sum_{k \ge 0} \binom{k + m - 1}{m - 1} x^k, \tag{4.12}$$

obtained by $(m - 1)$-fold differentiation of both sides in $(1 - x)^{-1} = 1 + x + x^2 +
x^3 + x^4 + \cdots$.


Exercise 4.4 Check that $\left(\frac{d}{dx}\right)^m (1 - x)^{-1} = m!/(1 - x)^{m+1}$.


Thus, a partial fraction with multiple denominator is expanded as

$$\frac{\beta}{(1 - \alpha x)^m} = \beta \sum_{k \ge 0} \alpha^k \binom{k + m - 1}{m - 1} x^k. \tag{4.13}$$
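
The binomial coefficients in (4.12) can be confirmed without any differentiation, simply by multiplying $m$ copies of the geometric series. The Python sketch below is an illustration of this check; it relies on the observation that multiplying a series by $1/(1 - x) = 1 + x + x^2 + \cdots$ replaces its coefficients by their partial sums, and the function name is our own.

```python
from math import comb

def inverse_power_series(m, terms):
    """First `terms` coefficients of 1/(1 - x)^m as a product of m geometric series."""
    coeffs = [1] + [0] * (terms - 1)        # the constant series 1
    for _ in range(m):
        running, out = 0, []
        for c in coeffs:                    # multiplication by 1 + x + x^2 + ...
            running += c                    # turns coefficients into partial sums
            out.append(running)
        coeffs = out
    return coeffs

def binomial_coefficients(m, terms):
    """Right-hand side of (4.12): the numbers C(k + m - 1, m - 1)."""
    return [comb(k + m - 1, m - 1) for k in range(terms)]
```

For example, for $m = 3$ both functions produce the triangular numbers $1, 3, 6, 10, \ldots$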


4.2.4 Linear Recurrence Relations 


A linear recurrence of order $n$ on a sequence $z_k \in \Bbbk$ of unknowns is a relation

$$z_k + a_1z_{k-1} + a_2z_{k-2} + \cdots + a_nz_{k-n} = 0, \tag{4.14}$$

where $a_1, a_2, \ldots, a_n \in \Bbbk$ are given constants, and the first $n$ terms $z_0, z_1, \ldots, z_{n-1}$
of the sequence are also known. To solve equation (4.14) means to write $z_k$ as an
explicit function of $k$. This can be done as follows. Consider the generating power
series of the sequence

$$z(x) \stackrel{\mathrm{def}}{=} z_0 + z_1x + z_2x^2 + \cdots = \sum_{k \ge 0} z_kx^k \in \Bbbk[[x]].$$
Relation (4.14) says that the product $z(x) \cdot (1 + a_1x + a_2x^2 + \cdots + a_nx^n)$ is a
polynomial of degree at most $n - 1$. Hence, $z(x)$ is a power series expansion for
the rational function

$$\frac{b_0 + b_1x + \cdots + b_{n-1}x^{n-1}}{1 + a_1x + a_2x^2 + \cdots + a_nx^n}. \tag{4.15}$$

The coefficients $b_0, b_1, \ldots, b_{n-1}$ are uniquely determined by the initial terms
$z_0, z_1, \ldots, z_{n-1}$ by means of the equality

$$\bigl(z_0 + z_1x + \cdots + z_{n-1}x^{n-1}\bigr) \cdot \bigl(1 + a_1x + a_2x^2 + \cdots + a_nx^n\bigr)
= b_0 + b_1x + \cdots + b_{n-1}x^{n-1} + \text{higher-degree terms}. \tag{4.16}$$

Therefore, to write $z_k$ as an explicit function of $k$, we have to compute
$b_0, b_1, \ldots, b_{n-1}$ from (4.16) and expand the fraction (4.15) into a power series.

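Example 4.4 (Fibonacci Numbers) Let us find the $k$th element of the Fibonacci
sequence $z_k$ defined recursively as

$$z_0 = 0, \quad z_1 = 1, \quad z_k = z_{k-1} + z_{k-2} \quad \text{for } k \ge 2.$$

This sequence satisfies a linear recurrence equation of second order: $z_k - z_{k-1} -
z_{k-2} = 0$. The equality (4.16) becomes $x \cdot (1 - x - x^2) = b_0 + b_1x + \cdots$ and gives
$b_0 = 0$, $b_1 = 1$. Thus, $z_k$ equals the $k$th coefficient in the power series expansion of
the rational function

$$\frac{x}{1 - x - x^2} = \frac{\beta_+}{1 - \alpha_+x} + \frac{\beta_-}{1 - \alpha_-x},$$

where $\alpha_\pm = (1 \pm \sqrt{5})/2$ are the roots of the polynomial^6 $t^2 - t - 1$, and the $\beta_\pm$ are
given by (4.11):

$$\beta_+ = \alpha_+ \cdot \alpha_+^{-1}/(\alpha_+ - \alpha_-) = \frac{1}{\sqrt{5}} \quad \text{and} \quad \beta_- = \alpha_- \cdot \alpha_-^{-1}/(\alpha_- - \alpha_+) = -\frac{1}{\sqrt{5}}.$$

Hence,

$$\frac{x}{1 - x - x^2} = \frac{1}{\sqrt{5}}\left(\frac{1}{1 - \alpha_+x} - \frac{1}{1 - \alpha_-x}\right) = \sum_{k \ge 0} \frac{\alpha_+^k - \alpha_-^k}{\sqrt{5}}\, x^k$$

and

$$z_k = \frac{(1 + \sqrt{5})^k - (1 - \sqrt{5})^k}{\sqrt{5} \cdot 2^k}.$$

^6 See formula (4.8) on p. 79.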
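
The closed formula for $z_k$ can be compared with the recurrence in exact arithmetic. The Python sketch below is an illustration under our own conventions: it writes $(1 + \sqrt{5})^k = a + b\sqrt{5}$ with integers $a$, $b$, so that $(1 - \sqrt{5})^k = a - b\sqrt{5}$ and the numerator of the formula equals $2b\sqrt{5}$; the function names are hypothetical.

```python
from fractions import Fraction

def fib_recurrence(n):
    """First n Fibonacci numbers z_0, ..., z_{n-1} from z_k = z_{k-1} + z_{k-2}."""
    z = [0, 1]
    while len(z) < n:
        z.append(z[-1] + z[-2])
    return z[:n]

def fib_closed_form(k):
    """z_k = ((1 + sqrt5)^k - (1 - sqrt5)^k) / (sqrt5 * 2^k), computed exactly."""
    a, b = 1, 0                      # (1 + sqrt5)^0 = 1 + 0 * sqrt5
    for _ in range(k):
        a, b = a + 5 * b, a + b      # multiply a + b*sqrt5 by 1 + sqrt5
    # numerator (a + b*sqrt5) - (a - b*sqrt5) = 2*b*sqrt5; sqrt5 cancels
    return Fraction(2 * b, 2 ** k)
```

Both functions return the same values $0, 1, 1, 2, 3, 5, 8, \ldots$, which also shows in examples that the quotient in the closed formula is an integer.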


Exercise 4.5 Can you show that $z_k$ is a positive integer?


Proposition 4.3 Let the polynomial (4.8) of the linear order-$n$ recurrence equation
$z_k + a_1z_{k-1} + a_2z_{k-2} + \cdots + a_nz_{k-n} = 0$, where $a_i \in \Bbbk$, be completely factorized in
$\Bbbk[t]$ as $t^n + a_1t^{n-1} + \cdots + a_{n-1}t + a_n = \prod_{i=1}^r (t - \alpha_i)^{m_i}$, where $\alpha_1, \alpha_2, \ldots, \alpha_r \in \Bbbk$
are distinct. Then every sequence $z_k$ satisfying this equation has the form $z_k =
\alpha_1^k \cdot \varphi_1(k) + \alpha_2^k \cdot \varphi_2(k) + \cdots + \alpha_r^k \cdot \varphi_r(k)$, where each $\varphi_i(x) \in \Bbbk[x]$ is a polynomial
of degree at most $m_i - 1$.


Proof The generating series $\sum z_kx^k \in \Bbbk[[x]]$ of any solution is a power series
expansion for a sum of partial fractions of the form $\beta \cdot (1 - \alpha x)^{-m}$, where $\alpha$ is a
root of the polynomial (4.8), the integer $m$ is in the range $1 \le m \le m_i$, and $\beta \in \Bbbk$ is
some constant uniquely determined by $\alpha$, $m$, $n$, and the initial terms of the sequence
$z_k$. By formula (4.13) on p. 80, such a fraction is expanded as $\sum_{k \ge 0} \beta \cdot \alpha^k \cdot \varphi(k) \cdot x^k$,
where $\varphi(k) = \binom{k + m - 1}{m - 1}$ is a polynomial of degree $m - 1$ in $k$. $\square$

Remark 4.3 The polynomial y(t) = fat” | +--++ay—1t+dy is called the char- 
acteristic polynomial of the recurrence relation Zz, +41 Z—1 +d2Zh-2 + +++ tnZk—n = 
0. As soon as a complete linear factorization of y(t) is known, Proposition 4.3 allows 
us to solve the recurrence relation via the method of undetermined coefficients: 
consider the coefficients of polynomials g, as unknowns and determine them from 
the initial conditions z; = a! - g (i) + a): g(i) + +++ +ak-g,(),0 <i<n-1, 
which form a system of n ordinary linear equations in n unknown coefficients. 
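The method of undetermined coefficients from Remark 4.3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the roots are rational and supplied together with their multiplicities, each $\varphi_i$ is written in the monomial basis $1, k, k^2, \ldots$, and the helper names `solve_linear` and `closed_form` are hypothetical.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination over the rationals.

    Assumes that A is square and nonsingular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [row[n] for row in M]

def closed_form(roots, initial):
    """roots: list of (alpha, multiplicity); initial: [z_0, ..., z_{n-1}].

    Returns a function k -> z_k of the shape sum_i alpha_i^k * phi_i(k)."""
    # One unknown per basis sequence alpha^k * k^j with 0 <= j < multiplicity.
    basis = [(Fraction(alpha), j) for alpha, m in roots for j in range(m)]
    A = [[alpha ** i * i ** j for alpha, j in basis] for i in range(len(initial))]
    coeffs = solve_linear(A, initial)
    return lambda k: sum(c * alpha ** k * k ** j
                         for c, (alpha, j) in zip(coeffs, basis))

if __name__ == "__main__":
    # z_k - 4 z_{k-1} + 4 z_{k-2} = 0 has characteristic polynomial (t - 2)^2.
    z = closed_form([(2, 2)], [1, 4])
    print([z(k) for k in range(6)])
```

For the double root $2$ with $z_0 = 1$, $z_1 = 4$, the linear system gives $\varphi(k) = 1 + k$, that is, $z_k = 2^k(1 + k)$.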


4.3 Logarithm and Exponential 


Throughout this section, we assume that $\Bbbk$ is a field of characteristic zero. In this case, formula (3.7) on p. 45 for the derivative,

\[ \left( a_0 + a_1x + a_2x^2 + \cdots \right)' = a_1 + 2a_2x + 3a_3x^2 + \cdots = \sum_{k \ge 1} k a_k x^{k-1}, \tag{4.17} \]

implies that for every power series $f(x) = a_0 + a_1x + a_2x^2 + \cdots \in \Bbbk[[x]]$, there exists a unique power series $F \in \Bbbk[[x]]$ without constant term such that $F' = f$. This series is called the antiderivative of $f$ and is denoted by

\[ \int f(x)\,dx \overset{\text{def}}{=} a_0x + \frac{a_1}{2}x^2 + \frac{a_2}{3}x^3 + \cdots = \sum_{k \ge 1} \frac{a_{k-1}}{k} x^k. \tag{4.18} \]

4.3.1 The Logarithm

The antiderivative of the alternating-sign geometric progression is called the logarithm and is denoted by

\[ \log(1+x) \overset{\text{def}}{=} \int \frac{dx}{1+x} = \int \left( 1 - x + x^2 - x^3 + \cdots \right) dx = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \cdots = \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} x^k. \tag{4.19} \]

Write $N \subset \Bbbk[[x]]$ for the additive abelian group of all power series without constant term and $U \subset \Bbbk[[x]]$ for the multiplicative abelian group of all power series with constant term 1. We can replace $1 + x$ in $\log(1+x)$ with any power series $u(x) \in U$. This is an algebraic operation, because it means substituting $u(x) - 1$ for $x$, and $u(x) - 1$ has no constant term. Therefore, taking the logarithm produces a well-defined map

\[ \log : U \to N, \quad u \mapsto \log u. \tag{4.20} \]


Exercise 4.6 (Logarithmic Derivative) Verify that $\frac{d}{dx} \log u = u'/u$ for all $u \in U$.

Lemma 4.3 For all $u, w \in U$, the equalities $u = w$, $u' = w'$, $\log(u) = \log(w)$, $u'/u = w'/w$ are equivalent.

Proof The first equality implies all the others. For two power series $u$, $w$ with equal constant terms, the first two equalities are equivalent by the differentiation formula (4.17). Replacing $u$, $w$ by $\log u$, $\log w$, we get the equivalence of the last two equalities. It remains to deduce the first equality from the last. The last equality forces $0 = u'/u - w'/w = (u'w - w'u)/uw = (w/u) \cdot (u/w)'$. Thus, $(u/w)' = 0$, that is, $u/w = \mathrm{const} = 1$. □

Exercise 4.7 Show that $\log(1/u) = -\log u$ for all $u \in U$.


4.3.2 The Exponential 


There exists a unique power series f € U such that f’ = f. It is called the exponential 
and denoted by 


\[ e^x \overset{\text{def}}{=} \sum_{k \ge 0} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \frac{x^5}{120} + \cdots \tag{4.21} \]

We can replace $x$ in $e^x$ by any power series $\tau(x) \in N$. This leads to the power series $e^{\tau(x)}$ with constant term 1. Therefore, the exponential produces a well-defined map,

\[ \exp : N \to U, \quad \tau \mapsto e^{\tau}. \tag{4.22} \]


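Theorem 4.2 The maps (4.22) and (4.20) taking the exponential and logarithm of power series

\[ N \;\underset{\log}{\overset{\exp}{\rightleftarrows}}\; U \tag{4.23} \]

are isomorphisms of abelian groups each the inverse of the other. That is, the equalities

\[ \log e^{\tau} = \tau, \quad e^{\log u} = u, \quad \log(u_1u_2) = \log(u_1) + \log(u_2), \quad e^{\tau_1 + \tau_2} = e^{\tau_1} e^{\tau_2}, \]

hold for all $u, u_1, u_2 \in U$ and all $\tau, \tau_1, \tau_2 \in N$.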


Proof Differentiation of both sides immediately verifies the equality $\log e^{\tau} = \tau$. After that, taking the logarithms of both sides verifies the equality $e^{\log u} = u$. Therefore, the maps (4.23) are bijections that are inverses to each other. The power series $\log(u_1u_2)$ and $\log u_1 + \log u_2$ coincide, because they have equal constant terms and equal derivatives:

\[ \left( \log(u_1u_2) \right)' = \frac{(u_1u_2)'}{u_1u_2} = \frac{u_1'u_2 + u_1u_2'}{u_1u_2} = \frac{u_1'}{u_1} + \frac{u_2'}{u_2} = \left( \log u_1 + \log u_2 \right)'. \]

Hence, $\log$ is a homomorphism. This forces the inverse map to be a homomorphism as well. □
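The identities of Theorem 4.2 can be tested on truncated power series with rational coefficients. The sketch below is an illustration only: series are stored as lists of their first `N` coefficients, all computations are performed modulo $x^N$, and the helper names (`pad`, `mul`, `compose`, `exp_series`, `log_series`) are hypothetical. Substituting a series without constant term into another series modulo $x^N$ uses only the first $N$ coefficients of the outer series, so the truncation is exact.

```python
from fractions import Fraction

N = 8  # all series are truncated modulo x^N

def pad(coeffs):
    """Turn a short coefficient list into a list of N rational coefficients."""
    return [Fraction(c) for c in coeffs] + [Fraction(0)] * (N - len(coeffs))

def mul(f, g):
    """Product of two truncated series modulo x^N."""
    h = [Fraction(0)] * N
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if i + j < N:
                h[i + j] += a * b
    return h

def compose(outer, inner):
    """outer(inner(x)) modulo x^N; requires inner to have zero constant term."""
    result = [Fraction(0)] * N
    power = pad([1])                              # inner^0
    for c in outer:
        result = [r + c * p for r, p in zip(result, power)]
        power = mul(power, inner)
    return result

# Coefficients of e^x from (4.21) and of log(1 + x) from (4.19).
EXP = [Fraction(1)]
for k in range(1, N):
    EXP.append(EXP[-1] / k)
LOG1P = [Fraction(0)] + [Fraction((-1) ** (k - 1), k) for k in range(1, N)]

def exp_series(tau):
    """e^tau for tau in N (zero constant term)."""
    return compose(EXP, tau)

def log_series(u):
    """log u for u in U (constant term 1): substitute u - 1 into log(1 + x)."""
    return compose(LOG1P, [Fraction(0)] + list(u[1:]))

if __name__ == "__main__":
    tau = pad([0, 1, 2, 0, -1])
    print(log_series(exp_series(tau)) == tau)
```

Here `log_series(exp_series(tau))` returns `tau` again, in accordance with the equality $\log e^{\tau} = \tau$.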


Exercise 4.8 Prove that $e^{x+y} = e^xe^y$ in $\Bbbk[[x, y]]$ by straightforward comparison of the coefficients of like monomials on both sides.


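4.3.3 Power Function and Binomial Formula

For $\alpha \in \Bbbk$, the binomial series with exponent $\alpha$ is defined as

\[ (1+x)^{\alpha} \overset{\text{def}}{=} e^{\alpha \log(1+x)}. \]

In this formula, $1 + x$ can be replaced by any power series $u \in U$. Thus, for every $\alpha \in \Bbbk$ there is an algebraic operation $U \to U$, $u \mapsto u^{\alpha}$, called the power function with exponent $\alpha$. It satisfies the expected list of properties: for all $u, v \in U$ and all $\alpha, \beta \in \Bbbk$, we have

\[ u^{\alpha} \cdot u^{\beta} = e^{\alpha \log u} \cdot e^{\beta \log u} = e^{\alpha \log u + \beta \log u} = e^{(\alpha+\beta) \log u} = u^{\alpha+\beta}, \tag{4.24} \]
\[ (u^{\alpha})^{\beta} = e^{\beta \log(u^{\alpha})} = e^{\beta \log\left( e^{\alpha \log u} \right)} = e^{\alpha\beta \log u} = u^{\alpha\beta}, \tag{4.25} \]
\[ (uv)^{\alpha} = e^{\alpha \log(uv)} = e^{\alpha(\log u + \log v)} = e^{\alpha \log u + \alpha \log v} = e^{\alpha \log u} \cdot e^{\alpha \log v} = u^{\alpha}v^{\alpha}. \tag{4.26} \]

In particular, $u^{1/n} = \sqrt[n]{u}$ for every $u \in U$ in the sense that $(u^{1/n})^n = u$.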
To find the coefficients of the binomial $(1+x)^{\alpha} = 1 + a_1x + a_2x^2 + \cdots$ explicitly, let us take the logarithmic derivative of both sides. This gives

\[ \frac{\alpha}{1+x} = \frac{a_1 + 2a_2x + 3a_3x^2 + \cdots}{1 + a_1x + a_2x^2 + \cdots}. \]

Thus, $\alpha \cdot (1 + a_1x + a_2x^2 + \cdots) = (1+x) \cdot (a_1 + 2a_2x + 3a_3x^2 + \cdots)$. Therefore, $a_1 = \alpha$ and $\alpha a_{k-1} = ka_k + (k-1)a_{k-1}$ for $k \ge 2$. This leads to

\[ a_k = \frac{\alpha - (k-1)}{k} \cdot a_{k-1} = \frac{(\alpha - (k-1))(\alpha - (k-2))}{k(k-1)} \cdot a_{k-2} = \cdots = \frac{(\alpha - (k-1))(\alpha - (k-2)) \cdots (\alpha - 1)\alpha}{k!}. \]

Both the numerator and denominator of the last fraction consist of $k$ factors decreasing by 1 from $k$ to 1 in the denominator and from $\alpha$ to $\alpha - k + 1$ in the numerator. This fraction is denoted by

\[ \binom{\alpha}{k} \overset{\text{def}}{=} \frac{\alpha(\alpha-1) \cdots (\alpha-k+1)}{k!} \tag{4.27} \]

and called the binomial coefficient of the exponent $\alpha \in \Bbbk$. We make the following claim.


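Proposition 4.4 (Newton’s Binomial Formula) For every $\alpha \in \Bbbk$, there exists a formal power series expansion

\[ (1+x)^{\alpha} = \sum_{k \ge 0} \binom{\alpha}{k} x^k = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2} x^2 + \frac{\alpha(\alpha-1)(\alpha-2)}{6} x^3 + \cdots. \tag{4.28} \]

Example 4.5 (Binomial with Rational Exponent) If the exponent $\alpha$ is equal to $n \in \mathbb{N}$, then the numerator (4.27) acquires zero factors for all $k > n$, and the binomial expansion (4.28) becomes finite:

\[ (1+x)^n = 1 + nx + \frac{n(n-1)}{2} x^2 + \cdots + x^n = \sum_{k=0}^{n} \binom{n}{k} x^k. \]

This agrees with formula (1.7) on p. 6. If $\alpha = -m$ is a negative integer, then the expansion (4.28) turns into the one obtained in formula (4.12) on p. 80:

\[ (1+x)^{-m} = 1 - mx + \frac{m(m+1)}{2} x^2 - \frac{m(m+1)(m+2)}{6} x^3 + \cdots = \sum_{k \ge 0} (-1)^k \binom{k+m-1}{k} \cdot x^k. \]

For $\alpha = 1/n$, $n \in \mathbb{N}$, the binomial formula (4.28) expands the radical function

\[ \sqrt[n]{1+x} = 1 + \frac{x}{n} + \frac{\frac{1}{n}\left(\frac{1}{n}-1\right)}{2} x^2 + \frac{\frac{1}{n}\left(\frac{1}{n}-1\right)\left(\frac{1}{n}-2\right)}{6} x^3 + \cdots \]
\[ = 1 + \frac{x}{n} - \frac{n-1}{2} \cdot \frac{x^2}{n^2} + \frac{(n-1)(2n-1)}{2 \cdot 3} \cdot \frac{x^3}{n^3} - \frac{(n-1)(2n-1)(3n-1)}{2 \cdot 3 \cdot 4} \cdot \frac{x^4}{n^4} + \cdots. \]

For example, when $n = 2$, the coefficient of $x^k$ equals

\[ (-1)^{k-1} \cdot \frac{1 \cdot 3 \cdot 5 \cdots (2k-3)}{2 \cdot 4 \cdot 6 \cdots (2k)} = \frac{(-1)^{k-1}}{2k-1} \cdot \frac{(2k)!}{(2 \cdot 4 \cdot 6 \cdots (2k))^2} = \frac{(-1)^{k-1}}{(2k-1) \cdot 4^k} \cdot \binom{2k}{k}. \]

Thus,

\[ \sqrt{1+x} = \sum_{k \ge 0} \frac{(-1)^{k-1}}{2k-1} \cdot \binom{2k}{k} \cdot \frac{x^k}{4^k}. \tag{4.29} \]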
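The binomial coefficients (4.27) are easy to evaluate exactly for rational exponents. The following Python sketch (function names are hypothetical) computes the first coefficients of $(1+x)^{\alpha}$ by formula (4.28) and compares the case $\alpha = 1/2$ with the closed expression (4.29).

```python
from fractions import Fraction
from math import comb

def binom(alpha, k):
    """alpha (alpha - 1) ... (alpha - k + 1) / k! for a rational alpha, as in (4.27)."""
    result = Fraction(1)
    for j in range(k):
        result = result * (Fraction(alpha) - j) / (j + 1)
    return result

def binomial_series(alpha, n_terms):
    """First n_terms coefficients of (1 + x)^alpha according to (4.28)."""
    return [binom(alpha, k) for k in range(n_terms)]

def sqrt_coefficient(k):
    """Coefficient of x^k in sqrt(1 + x) according to (4.29)."""
    sign = 1 if k % 2 == 1 else -1                # (-1)^(k-1)
    return Fraction(sign * comb(2 * k, k), (2 * k - 1) * 4 ** k)

def truncated_square(f):
    """Coefficients of f(x)^2 modulo x^len(f)."""
    n = len(f)
    return [sum(f[i] * f[k - i] for i in range(k + 1)) for k in range(n)]

if __name__ == "__main__":
    root = binomial_series(Fraction(1, 2), 6)
    print(root)                      # coefficients of sqrt(1 + x)
    print(truncated_square(root))    # coefficients of 1 + x modulo x^6
```

Squaring the truncated series returns $1 + x$ up to the truncation order, which illustrates the relation $(u^{1/n})^n = u$ for $n = 2$.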


Example 4.6 (Catalan Numbers) Let us use the square root expansion (4.29) to deduce an explicit formula for the Catalan numbers, which appear in many combinatorial problems. The product of $(n + 1)$ quantities

\[ a_0a_1a_2 \cdots a_n \quad \text{(an $n$-fold product)} \tag{4.30} \]

can be computed in $n$ steps by executing one multiplication per step. If in each step, we enclose the items to be multiplied in parentheses, then the entire computation will be encoded by $n$ pairs of parentheses inserted into (4.30). For $n = 1$, there is just one arrangement of parentheses: $(a_1a_2)$; for $n = 2$, there are two: $(a_1(a_2a_3))$ and $((a_1a_2)a_3)$; for $n = 3$, there are five:

\[ (a_1(a_2(a_3a_4))), \quad (a_1((a_2a_3)a_4)), \quad ((a_1a_2)(a_3a_4)), \quad ((a_1(a_2a_3))a_4), \quad (((a_1a_2)a_3)a_4). \]


We see that not all n! combinations of n sequential multiplications appear in the 
step-by-step evaluation of (4.30). The total number of admissible distributions of n 
pairs of parentheses in (4.30) provided by all evaluations of the product is called the 
$n$th Catalan number and is denoted by $c_n$. It is also convenient to put $c_0 \overset{\text{def}}{=} 1$. The next few values are $c_1 = 1$, $c_2 = 2$, $c_3 = 5$.

For $n \ge 2$, the set of all admissible arrangements of $n$ pairs of parentheses splits into $n$ disjoint classes in accordance with the position of the penultimate pair of parentheses:

\[ (a_0(a_1 \cdots a_n)), \quad ((a_0a_1)(a_2 \cdots a_n)), \quad ((a_0a_1a_2)(a_3 \cdots a_n)), \quad ((a_0 \cdots a_3)(a_4 \cdots a_n)), \quad \ldots, \]
\[ ((a_0 \cdots a_{n-3})(a_{n-2}a_{n-1}a_n)), \quad ((a_0 \cdots a_{n-2})(a_{n-1}a_n)), \quad ((a_0 \cdots a_{n-1})a_n). \]

The classes consist of $c_{n-1}$, $c_1c_{n-2}$, $c_2c_{n-3}$, $c_3c_{n-4}$, $\ldots$, $c_{n-2}c_1$, $c_{n-1}$ elements respectively. Therefore the Catalan numbers satisfy the recurrence

\[ c_n = c_0c_{n-1} + c_1c_{n-2} + \cdots + c_{n-2}c_1 + c_{n-1}c_0, \]


which says that the Catalan power series 


\[ c(x) = \sum_{k \ge 0} c_kx^k = 1 + c_1x + c_2x^2 + c_3x^3 + \cdots \in \mathbb{Z}[[x]] \]

satisfies the relation $c(x)^2 = (c(x) - 1)/x$. In other words, $t = c(x)$ is a root of the quadratic equation $x \cdot t^2 - t + 1 = 0$ in $t$ with coefficients in the field of Laurent series $\mathbb{Q}((x))$. By the famous quadratic formula, the two roots of this polynomial are

\[ \frac{1 \pm \sqrt{1 - 4x}}{2x}. \tag{4.31} \]

Since

\[ \sqrt{1 - 4x} = -\sum_{k \ge 0} \frac{1}{2k-1} \cdot \binom{2k}{k} \cdot x^k = 1 - 2x - 2x^2 - 4x^3 - \cdots \]

and we are looking for the root living in the subring $\mathbb{Z}[[x]] \subset \mathbb{Q}((x))$, we have to choose the minus sign in (4.31). Thus, $c(x) = (1 - \sqrt{1 - 4x})/(2x)$, and we conclude from (4.29) that

\[ c_k = \frac{1}{2} \cdot \frac{1}{2k+1} \cdot \binom{2k+2}{k+1} = \frac{1}{k+1} \cdot \binom{2k}{k}. \]

Exercise 4.9 Can you show that $c_k$ is an integer?
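The two descriptions of the Catalan numbers obtained in Example 4.6, the convolution recurrence and the closed formula $c_k = \frac{1}{k+1}\binom{2k}{k}$, can be compared numerically. The following Python sketch uses hypothetical function names and takes the integrality of the closed formula for granted when it applies integer division.

```python
from math import comb

def catalan_by_recurrence(n_max):
    """c_0, ..., c_{n_max} from c_0 = 1 and c_n = c_0 c_{n-1} + ... + c_{n-1} c_0."""
    c = [1]
    for n in range(1, n_max + 1):
        c.append(sum(c[i] * c[n - 1 - i] for i in range(n)))
    return c

def catalan_closed(k):
    """c_k = binom(2k, k) / (k + 1); the division is exact."""
    return comb(2 * k, k) // (k + 1)

if __name__ == "__main__":
    print(catalan_by_recurrence(8))
    print([catalan_closed(k) for k in range(9)])
```

Both lines print the same list, starting with $1, 1, 2, 5, 14$.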


4.4 Todd’s Series and Bernoulli Numbers 


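4.4.1 Action of $\mathbb{Q}[[d/dt]]$ on $\mathbb{Q}[t]$

Consider the ring of formal power series $\mathbb{Q}[[x]]$ in the variable $x$ and the ring of polynomials $\mathbb{Q}[t]$ in the variable $t$, and write

\[ D = \frac{d}{dt} : \mathbb{Q}[t] \to \mathbb{Q}[t], \quad f \mapsto f', \]

for the differentiation operator acting on $\mathbb{Q}[t]$. We can substitute $D$ for $x$ in any formal power series $\Phi(x) = \sum_{k \ge 0} \varphi_kx^k \in \mathbb{Q}[[x]]$ and treat the result as a map

\[ \Phi(D) : \mathbb{Q}[t] \to \mathbb{Q}[t], \quad f \mapsto \varphi_0 \cdot f + \varphi_1 \cdot f' + \varphi_2 \cdot f'' + \cdots = \sum_{k \ge 0} \varphi_k D^k f. \tag{4.32} \]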


Since each differentiation decreases the degree of a polynomial by 1, all summands with $k > \deg f$ vanish. Therefore, for every $f \in \mathbb{Q}[t]$, the right-hand side of (4.32) is a well-defined polynomial of degree at most $\deg f$. We write $\Phi(D)f$ for this polynomial. Note that its coefficients are polynomials in the coefficients of $f$ and the first $\deg f + 1$ coefficients of $\Phi$. Moreover, $\Phi(D)f$ is linear homogeneous in $f$ in the sense that

\[ \forall\, \alpha, \beta \in \mathbb{Q} \;\; \forall\, f, g \in \mathbb{Q}[t], \quad \Phi(D)(\alpha \cdot f + \beta \cdot g) = \alpha \cdot \Phi(D)f + \beta \cdot \Phi(D)g, \tag{4.33} \]

because the derivation operator $D$ and all its iterations $D^k$ are linear homogeneous. Note also that after the substitution $x = D$ in the product of power series $\Phi(x)\Psi(x) \in \mathbb{Q}[[x]]$, we get the composition of maps $\Phi(D) \circ \Psi(D)$.

Exercise 4.10 Check all these claims. 

Hence, all maps $\Phi(D)$ commute with each other. Any two maps $\Phi(D)$, $\Phi^{-1}(D)$ resulting from inverse elements $\Phi$, $\Phi^{-1} = 1/\Phi$ of $\mathbb{Q}[[x]]$ are mutually inverse bijective maps. Therefore, the operators $\Phi(D)$ with invertible $\Phi \in \mathbb{Q}[[x]]$ form an abelian transformation group of $\mathbb{Q}[t]$ in the sense of Example 1.7 on p. 13.

The linearity of $\Phi(D)f$ in $f$ allows us to compute $\Phi(D)f$ for every $f$ as soon as the sequence of polynomials $\Phi_m(t) \overset{\text{def}}{=} \Phi(D)t^m$ is known. For

\[ f(t) = a_0 + a_1t + \cdots + a_nt^n, \]

we get

\[ \Phi(D)\left( a_0 + a_1t + \cdots + a_nt^n \right) = a_0\Phi_0(t) + a_1\Phi_1(t) + a_2\Phi_2(t) + \cdots + a_n\Phi_n(t). \]

The polynomial $\Phi_m \in \mathbb{Q}[t]$ is called the $m$th Appell polynomial of the power series $\Phi \in \mathbb{Q}[[x]]$. It depends only on the first $m + 1$ coefficients of $\Phi$ and has degree at most $m$.


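Example 4.7 (Shift Operators) The Appell polynomials of the exponential $e^x = 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3 + \cdots$ are

\[ e^Dt^m = \sum_{k \ge 0} \frac{1}{k!} D^kt^m = \sum_{k=0}^{m} \frac{m(m-1) \cdots (m-k+1)}{k!} \, t^{m-k} = \sum_{k=0}^{m} \binom{m}{k} t^{m-k} = (t+1)^m. \]

Hence, the operator $e^D$ acts on $\mathbb{Q}[t]$ as the shift of variable $e^D : f(t) \mapsto f(t+1)$. Since the series $e^{-x}$ is inverse to $e^x$ in $\mathbb{Q}[[x]]$, the operator $e^{-D}$ acts as the inverse shift

\[ e^{-D} : f(t) \mapsto f(t-1). \]

Exercise 4.11 Check that $e^{\alpha D} : f(t) \mapsto f(t + \alpha)$ for all $\alpha \in \mathbb{Q}$.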
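The action (4.32) of a power series in $D$ on polynomials is straightforward to program. In the sketch below, a polynomial is a list of coefficients in increasing degree and a power series is given by a function returning its $k$th coefficient; these conventions and all names are assumptions made for the illustration. The example applies $e^D$ and $e^{-D}$ to a cubic polynomial.

```python
from fractions import Fraction
from math import factorial

def derivative(f):
    """Derivative of f_0 + f_1 t + f_2 t^2 + ... given as [f_0, f_1, f_2, ...]."""
    return [k * c for k, c in enumerate(f)][1:]

def apply_series(phi, f):
    """Apply Phi(D) = sum_k phi(k) D^k to the polynomial f, as in (4.32)."""
    result = [Fraction(0)] * len(f)
    g = [Fraction(c) for c in f]
    k = 0
    while g:                        # D^k f is the empty list once k > deg f
        for i, c in enumerate(g):
            result[i] += phi(k) * c
        g = derivative(g)
        k += 1
    return result

def evaluate(f, t):
    """Value of the polynomial f at the point t."""
    return sum(c * t ** i for i, c in enumerate(f))

def exp_coeff(k):
    """k-th coefficient of e^x."""
    return Fraction(1, factorial(k))

def exp_inv_coeff(k):
    """k-th coefficient of e^{-x}."""
    return Fraction((-1) ** k, factorial(k))

if __name__ == "__main__":
    f = [1, 0, -2, 3]                        # 1 - 2 t^2 + 3 t^3
    shifted = apply_series(exp_coeff, f)     # coefficients of f(t + 1)
    print(shifted)
    print(apply_series(exp_inv_coeff, shifted))
```

For $f(t) = 1 - 2t^2 + 3t^3$, the first result is the coefficient list of $f(t+1) = 2 + 5t + 7t^2 + 3t^3$, and applying $e^{-D}$ restores $f$.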


Example 4.8 (Power Sums of Integers) We are looking for polynomials $S_m(t) \in \mathbb{Q}[t]$, numbered by integers $m \ge 0$, such that

\[ S_m(n) = 0^m + 1^m + 2^m + 3^m + \cdots + n^m = \sum_{k=0}^{n} k^m \tag{4.34} \]

for all integers $n \ge 0$. For example, for $m = 0, 1, 2, 3$, we have the well-known formulas⁷

\[ \begin{aligned} S_0(n) &= 1 + 1 + 1 + \cdots + 1 = n, \\ S_1(n) &= 1 + 2 + 3 + \cdots + n = n(n+1)/2, \\ S_2(n) &= 1^2 + 2^2 + 3^2 + \cdots + n^2 = n(n+1)(2n+1)/6, \\ S_3(n) &= 1^3 + 2^3 + 3^3 + \cdots + n^3 = n^2(n+1)^2/4 = S_1(n)^2, \end{aligned} \tag{4.35} \]

which mean that $S_0(t) = t$, $S_1(t) = t(t+1)/2$, $S_2(t) = t(t+1)(2t+1)/6$, $S_3 = S_1^2$. To analyze the general case, let us consider the difference operator $\nabla = 1 - e^{-D} : g(t) \mapsto g(t) - g(t-1)$. If a required polynomial $S_m(t)$ exists, then

\[ \nabla S_m(t) = t^m, \tag{4.36} \]

because of $S_m(n) - S_m(n-1) = n^m$ for all $n \in \mathbb{N}$. Conversely, if we find $S_m(t) \in \mathbb{Q}[t]$ that solves equation (4.36) and has $S_m(0) = 0$, then equality (4.34) is automatically satisfied, because

\[ S_m(n) = n^m + S_m(n-1) = n^m + (n-1)^m + S_m(n-2) = \cdots = n^m + (n-1)^m + \cdots + 1^m + 0^m. \]

⁷ Do not worry if you are not conversant with some of them. We will deduce them all soon.


Thus, we have to solve the equation (4.36) in $S_m \in \mathbb{Q}[t]$. If $\nabla$ were invertible, we could do this at once by applying $\nabla^{-1}$ to both sides. However, the power series $1 - e^{-x}$ is not invertible in $\mathbb{Q}[[x]]$, because it has zero constant term. To avoid this problem, write it as $\frac{1 - e^{-x}}{x} \cdot x$, where the first factor gets the unit constant term and becomes invertible. The inverse series to the first factor,

\[ \operatorname{td}(x) \overset{\text{def}}{=} \frac{x}{1 - e^{-x}} \in \mathbb{Q}[[x]], \]

is called Todd’s series. Substituting $x = D$ in the equality $\operatorname{td}(x) \cdot (1 - e^{-x}) = x$ leads to $\operatorname{td}(D) \circ \nabla = D$. Thus, if we apply $\operatorname{td}(D)$ to both sides of (4.36), we get

\[ DS_m(t) = \operatorname{td}(D)t^m. \]

In other words, the derivative $S_m'(t)$ is equal to the $m$th Appell polynomial $\operatorname{td}_m(t)$ of Todd’s series. Since $S_m$ has zero constant term, $S_m(t) = \int \operatorname{td}_m(t)\,dt$. To make this answer more precise, let us write Todd’s series in “exponential form,” that is, introduce $a_k \in \mathbb{Q}$ such that

\[ \operatorname{td}(x) = \sum_{k \ge 0} \frac{a_k}{k!} x^k. \tag{4.37} \]

Then

\[ S_m(t) = \int \left( \sum_{k=0}^{m} \frac{a_k}{k!} D^kt^m \right) dt = \int \left( \sum_{k=0}^{m} \binom{m}{k} a_kt^{m-k} \right) dt = \sum_{k=0}^{m} \binom{m}{k} \frac{a_kt^{m-k+1}}{m-k+1}. \]
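So,

\[ S_m(t) = \frac{1}{m+1} \left( \binom{m+1}{1} a_mt + \binom{m+1}{2} a_{m-1}t^2 + \cdots + \binom{m+1}{m} a_1t^m + t^{m+1} \right). \]

This formula is often symbolically represented as

\[ (m+1) \cdot S_m(t) = (a^{\downarrow} + t)^{m+1} - a_{m+1}, \]

where the arrow in $a^{\downarrow}$ indicates that $a^k$ is to be replaced by $a_k$ in the expansion of the binomial $(a + t)^{m+1}$. The coefficients $a_k$ of Todd’s series (4.37) can be found one by one from the relation

\[ \left( 1 + a_1x + \frac{a_2}{2}x^2 + \frac{a_3}{6}x^3 + \frac{a_4}{24}x^4 + \cdots \right) \cdot \left( 1 - \frac{1}{2}x + \frac{1}{6}x^2 - \frac{1}{24}x^3 + \frac{1}{120}x^4 - \cdots \right) = 1, \]

which says that $\operatorname{td}(x) \cdot (1 - e^{-x})/x = 1$. For example, $a_1 = \frac{1}{2}$, $a_2 = \frac{1}{6}$, $a_3 = 0$, $a_4 = -\frac{1}{30}$, and

\[ \begin{aligned} 3S_2(t) &= 3a_2t + 3a_1t^2 + t^3 = \tfrac{1}{2}t + \tfrac{3}{2}t^2 + t^3, \\ 4S_3(t) &= 4a_3t + 6a_2t^2 + 4a_1t^3 + t^4 = t^2 + 2t^3 + t^4, \end{aligned} \]

in agreement with (4.35).

Exercise 4.12 Compute the first dozen of the $a_k$, continue the list (4.35) up to $S_{10}(n)$, and evaluate⁸ $S_{10}(1000)$.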
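The computation of Example 4.8 can be followed step by step in Python. This sketch (with hypothetical function names) finds the coefficients $a_k$ of Todd's series from the relation $\operatorname{td}(x) \cdot (1 - e^{-x})/x = 1$, then assembles $S_m(t) = \sum_{k=0}^{m} \binom{m}{k} a_k t^{m-k+1}/(m-k+1)$ and compares its values with directly computed power sums.

```python
from fractions import Fraction
from math import comb, factorial

def todd_numbers(n):
    """Return a_0, ..., a_n with td(x) = sum_k a_k x^k / k!.

    Uses td(x) * E(x) = 1, where E(x) = (1 - e^{-x})/x has the
    coefficients e_j = (-1)^j / (j + 1)!."""
    e = [Fraction((-1) ** j, factorial(j + 1)) for j in range(n + 1)]
    t = [Fraction(1)]                      # t_k = a_k / k!
    for k in range(1, n + 1):
        t.append(-sum(e[j] * t[k - j] for j in range(1, k + 1)))
    return [t[k] * factorial(k) for k in range(n + 1)]

def power_sum_poly(m):
    """Coefficients [s_0, ..., s_{m+1}] of S_m(t) in increasing degree."""
    a = todd_numbers(m)
    s = [Fraction(0)] * (m + 2)
    for k in range(m + 1):
        s[m - k + 1] = comb(m, k) * a[k] / (m - k + 1)
    return s

def evaluate(f, t):
    """Value of the polynomial f at the point t."""
    return sum(c * t ** i for i, c in enumerate(f))

if __name__ == "__main__":
    print(todd_numbers(4))
    print(power_sum_poly(2))               # t/6 + t^2/2 + t^3/3
    print(evaluate(power_sum_poly(10), 1000) == sum(k ** 10 for k in range(1001)))
```

The last line compares the value $S_{10}(1000)$ obtained from the polynomial with the sum of a thousand tenth powers computed directly.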


4.4.2 Bernoulli Numbers 


Todd’s series got its name in the middle of the twentieth century when it appeared 
in algebraic topology. In the seventeenth and eighteenth centuries, Jacob Bernoulli 
and Leonhard Euler, who were the first developers of the subject, preferred to use 
the power series 


\[ \operatorname{td}(-x) = \frac{x}{e^x - 1} = \sum_{k \ge 0} \frac{B_k}{k!} x^k, \]

whose coefficients $B_k$ came to be called the Bernoulli numbers. Since we have

\[ \operatorname{td}(x) - \operatorname{td}(-x) = \frac{x}{1 - e^{-x}} + \frac{x}{1 - e^{x}} = x \cdot \frac{2 - e^x - e^{-x}}{2 - e^x - e^{-x}} = x, \tag{4.38} \]

all but the first Bernoulli numbers coincide with the coefficients $a_k$ used above, i.e., $B_k = a_k$ for all $k \ne 1$, whereas $B_1 = -a_1 = -\frac{1}{2}$. Moreover, it follows from (4.38) that the odd part of Todd’s series is exhausted by the linear term. This forces $B_{2k+1} = 0$ for all $k \ge 1$. An extensive literature is devoted to the Bernoulli numbers⁹ $B_{2k}$. However, despite many beautiful theorems about them, not much is known about the explicit dependence of $B_{2k}$ on $k$.

⁸ Jacob Bernoulli (1654-1705) did this job in about seven minutes having just pen and paper, as he wrote (not without some pride) in his Ars Conjectandi, published posthumously in 1713 (see [Be]).

Exercise 4.13 Prove the recurrence relation $(n+1)B_n = -\sum_{k=0}^{n-1} \binom{n+1}{k} \cdot B_k$.
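Taking the recurrence of Exercise 4.13 for granted, the Bernoulli numbers can be tabulated with exact rational arithmetic. The Python sketch below is an illustration with hypothetical function names; the second function checks the result against the defining series $x/(e^x - 1) = \sum B_kx^k/k!$ by computing coefficients of the product $(e^x - 1) \cdot \sum B_kx^k/k!$, which must equal $x$.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(n_max):
    """B_0, ..., B_{n_max} from B_0 = 1 and (n+1) B_n = -sum_{k<n} C(n+1, k) B_k."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def product_coefficient(B, n):
    """Coefficient of x^(n+1) in (e^x - 1) * sum_k B_k x^k / k!.

    It should be 1 for n = 0 and 0 for every n >= 1."""
    return sum(B[k] / factorial(k) / factorial(n + 1 - k) for k in range(n + 1))

if __name__ == "__main__":
    B = bernoulli_numbers(12)
    print(B)
    print([product_coefficient(B, n) for n in range(6)])
```

The odd-indexed values after $B_1$ vanish, as predicted by (4.38).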


4.5 Fractional Power Series 


4.5.1 Puiseux Series 


A Laurent series in the variable $x^{1/q}$, that is, a formal sum of the type

\[ f(x) = \sum_{k \ge m} a_kx^{k/q}, \quad a_k \in \Bbbk, \; m \in \mathbb{Z}, \; q \in \mathbb{N}, \]


is called a Puiseux series in the variable x with coefficients in k. In other words, 
a Puiseux series is a fractional power series whose exponents are bounded from 
below and admit a common denominator. 


Exercise 4.14 Convince yourself that the Puiseux series form a field. 


Theorem 4.3 For an algebraically closed field k of zero characteristic, the field of 
Puiseux series in x with coefficients in k is algebraically closed too. 


Less formally, Theorem 4.3 says that the roots $y_1, y_2, \ldots, y_n$ of a polynomial

\[ a_n(x)y^n + a_{n-1}(x)y^{n-1} + \cdots + a_1(x)y + a_0(x) \]


whose coefficients are Puiseux series in x can be expanded as a Puiseux series in 
x. In particular, given a polynomial f(x,y) € k[x, y], the equation f(x,y) = 0, 
considered as a polynomial equation in y with coefficients in k[x], can be completely 
solved in Puiseux series y(x). In other words, the implicit algebraic functions over 
an algebraically closed field k are exhausted by the Puiseux series. We give two 
proofs of Theorem 4.3. The first, short and conceptual, goes back to van der Waerden 
and Hensel. The second, which allows us to expand implicit algebraic functions in 
Puiseux series effectively, was the original discovery of Newton. 


⁹ To begin with, I recommend Chapter 15 of the book A Classical Introduction to Modern Number Theory, by K. Ireland and M. Rosen [IR] and Section V.8 in the book Number Theory by Z. I. Borevich and I. R. Shafarevich [BS]. At http://www.bernoulli.org/ you may find a fast computer program that evaluates $B_{2k}$ as rational simplified fractions.

Lemma 4.4 (Hensel’s Lemma) Let $G(t,x) \in \Bbbk[[t]][x]$ be a monic polynomial in the variable $x$ with coefficients in $\Bbbk[[t]]$, where $\Bbbk$ is an arbitrary field. Assume that for $t = 0$, the polynomial $G(0,x) \in \Bbbk[x]$ is a product of two coprime monic polynomials of positive degree: $G(0,x) = a(x) \cdot b(x)$. Then in $\Bbbk[[t]][x]$, the polynomial $G(t,x)$ is also a product of two coprime monic polynomials, $G(t,x) = A(t,x) \cdot B(t,x)$, such that the degrees of $A$, $B$ in $x$ are equal to those of $a$, $b$, and $A(0,x) = a(x)$, $B(0,x) = b(x)$.


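Proof Write $G(t,x)$, $A(t,x)$, and $B(t,x)$ as power series in $t$ with coefficients in $\Bbbk[x]$:

\[ \begin{aligned} G(t,x) &= g_0(x) + g_1(x)\,t + g_2(x)\,t^2 + \cdots, \\ A(t,x) &= a_0(x) + a_1(x)\,t + a_2(x)\,t^2 + \cdots, \\ B(t,x) &= b_0(x) + b_1(x)\,t + b_2(x)\,t^2 + \cdots. \end{aligned} \]

Comparing coefficients of $t^k$ in the equality $G(t,x) = A(t,x) \cdot B(t,x)$ leads to a system of equations

\[ \begin{aligned} a_0(x)\,b_0(x) &= g_0(x) && (\text{for } k = 0), \\ a_0(x)\,b_k(x) + b_0(x)\,a_k(x) &= g_k(x) - \sum_{i=1}^{k-1} a_i(x)\,b_{k-i}(x) && (\text{for } k \ge 1). \end{aligned} \tag{4.39} \]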


We are given coprime monic polynomials $a_0(x) = a(x)$, $b_0(x) = b(x)$ satisfying the first equation. The polynomials $a_k$, $b_k$ of degrees $\deg a_k < \deg a$ and $\deg b_k < \deg b$ are uniquely determined by equation (4.39) as soon as we know all the previous $a_i$, $b_i$ and know that $\deg a_i < \deg a$ and $\deg b_i < \deg b$ for all $0 < i < k$. Indeed, since $G(t,x)$ is monic as a polynomial in $x$, for all $k \ge 1$ we have the inequalities $\deg g_k < \deg g_0$. Therefore, the degree of the right-hand side in (4.39) is strictly less than $\deg a_0 + \deg b_0$. Thus, $b_k$ is the unique polynomial of degree less than $\deg b_0$ whose residue modulo $b_0$ represents the quotient of the right-hand side by $a_0$ modulo¹⁰ $b_0$. The residue of $a_k$ modulo $a_0$ plays a similar role. Hence, $A$ and $B$ exist and have the required degrees. To verify that $A$ and $B$ are coprime in $\Bbbk[[t]][x]$, we have to construct two series of polynomials $p_i, q_i \in \Bbbk[x]$ of degrees bounded above such that the power series $P(t,x) = p_0(x) + p_1(x)\,t + p_2(x)\,t^2 + \cdots$, $Q(t,x) = q_0(x) + q_1(x)\,t + q_2(x)\,t^2 + \cdots$ satisfy the equality $AP + BQ = 1$. Comparing coefficients of $t^k$ leads, as above, to a system of equations

\[ \begin{aligned} a_0p_0 + b_0q_0 &= 1 && (\text{for } k = 0), \\ a_0p_k + b_0q_k &= -\sum_{i=1}^{k} \left( a_ip_{k-i} + b_iq_{k-i} \right) && (\text{for } k \ge 1). \end{aligned} \]

Since $a_0 = a$ and $b_0 = b$ are coprime and the inequalities $\deg a_i < \deg a_0$, $\deg b_i < \deg b_0$ hold for all $i > 0$, this system uniquely determines polynomials $p_i$, $q_i$ of degrees less than $\deg b_0$, $\deg a_0$ respectively. □

¹⁰ Compare with the proof of Proposition 4.1 on p. 76.
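The inductive construction in the proof of Hensel's lemma can be traced in the simplest special case, where $G(0,x) = (x - r_1)(x - r_2)$ is a product of two distinct linear factors. Then all $a_k$, $b_k$ with $k \ge 1$ are constants, and evaluating equation (4.39) at $x = r_1$ and at $x = r_2$ gives them directly. The Python sketch below covers only this special case over $\mathbb{Q}$; the function names are hypothetical.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Value at x of the polynomial with coefficient list p (increasing degree)."""
    return sum(c * x ** i for i, c in enumerate(p))

def hensel_lift_linear(g, r1, r2, order):
    """Lift G(0, x) = (x - r1)(x - r2), r1 != r2, to a factorization modulo t^(order+1).

    g[k] is the coefficient list of g_k(x) in G = sum_k g_k(x) t^k; missing
    entries are treated as zero. Returns ([a_1, ..., a_order], [b_1, ..., b_order])
    with A = (x - r1) + sum_k a_k t^k and B = (x - r2) + sum_k b_k t^k."""
    r1, r2 = Fraction(r1), Fraction(r2)
    a, b = [None], [None]           # index 0 is unused: a_0, b_0 are not constants
    for k in range(1, order + 1):
        gk = g[k] if k < len(g) else [0]
        # sum_{i=1}^{k-1} a_i b_{k-i} from the right-hand side of (4.39); a constant here
        cross = sum(a[i] * b[k - i] for i in range(1, k))
        # at x = r1 the term a_0 b_k vanishes and b_0(r1) = r1 - r2
        a.append((poly_eval(gk, r1) - cross) / (r1 - r2))
        # at x = r2 the term b_0 a_k vanishes and a_0(r2) = r2 - r1
        b.append((poly_eval(gk, r2) - cross) / (r2 - r1))
    return a[1:], b[1:]

if __name__ == "__main__":
    # G(t, x) = x^2 - 1 - t, so g_0 = x^2 - 1 and g_1 = -1.
    a, b = hensel_lift_linear([[-1, 0, 1], [-1]], 1, -1, 5)
    print(a)
    print(b)
```

For $G(t,x) = x^2 - 1 - t$ the factors are $x \mp \sqrt{1+t}$, so the constants $b_k$ reproduce the binomial coefficients $\binom{1/2}{k}$ of the expansion (4.28), and $a_k = -b_k$.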


Lemma 4.5 Let $\Bbbk$ be an algebraically closed field of zero characteristic. Then for every polynomial

\[ F(t,x) = a_n(t)\,x^n + a_{n-1}(t)\,x^{n-1} + \cdots + a_0(t) \in \Bbbk((t))[x], \]

there exist $m \in \mathbb{N}$ and a Laurent series $\vartheta(t) \in \Bbbk((t))$ such that $F(t^m, \vartheta(t)) = 0$ in $\Bbbk((t))$. In other words, each polynomial with coefficients in $\Bbbk((t))$ acquires a root in $\Bbbk((t^{1/m}))$ for appropriate $m \in \mathbb{N}$.

Proof Without loss of generality, we may assume that the coefficients of $F$ lie in $\Bbbk[[t]]$, that the leading coefficient is $a_n = 1$, and that the next coefficient is $a_{n-1} = 0$. The first is achieved via multiplication of $F$ by an appropriate power of $t$, the second by multiplication of $F$ by $a_n^{n-1}$ and renaming $a_nx$ by $x$, and the third by changing the variable¹¹ $x$ to $x - a_{n-1}/n$.


Exercise 4.15 For each of the three manipulations, verify that if the lemma holds 
for the modified polynomial, then it holds for the original one. 


If $F$ is factorized in $\Bbbk[[t]][x]$ as $F = GH$, where $0 < \deg G < \deg F$, we can apply induction on $\deg F$ and find $m$ and $\vartheta$ for $G$ instead of $F$. Now assume that $F$ is irreducible. Then by Hensel’s lemma, the polynomial $F(0,x) \in \Bbbk[x]$ cannot be decomposed in $\Bbbk[x]$ into a product of two coprime polynomials of positive degree. Over the algebraically closed field $\Bbbk$, this forces $F(0,x)$ to have a root of multiplicity $\deg F$. Therefore, $F(0,x) = (x - \alpha)^n$ for some $\alpha \in \Bbbk$. If $\alpha \ne 0$, the polynomial $(x - \alpha)^n$ contains a nonzero term¹² $-n\alpha x^{n-1}$ of degree $(n - 1)$. This contradicts our assumption on $F$. Hence, $F(0,x) = x^n$. This means that every series $a_m(t)$ with $m < n$ has a zero constant term. If each of the $a_i$ is the zero series, then $F(t,x) = x^n$ has a root $\vartheta(t) = 0$. Assume that not all the $a_i(t)$ are zero. We claim that for appropriate $p, q \in \mathbb{N}$, the substitution $t \mapsto t^q$, $x \mapsto t^px$, transforms the polynomial $F$ into the polynomial $t^{pn} \cdot H$, where $H$, as a polynomial in $x$, is monic and has no term of degree $n - 1$, and at least one coefficient of $H$ is a series with nonzero constant term. This forces $H$ to be reducible, and we will be able to apply induction.

The first two constraints on $H$ are automatically satisfied under the substitution $t \mapsto t^q$, $x \mapsto t^px$. With respect to the third, write each nonzero coefficient $a_m$ in $F$ as

\[ a_m(t) = \alpha_{\mu_m} t^{\mu_m} + \text{terms of higher degree in } t, \]

where $\alpha_{\mu_m} \ne 0$ is the lowest nonzero coefficient. We take $q$ equal to the common denominator of all fractions $\mu_m/(n - m)$. Then rewrite all these fractions as $p_m/q$ and put $p = \min(p_m)$. Therefore, for each $m$, we get the inequality $q\mu_m \ge p(n - m)$, which becomes an explicit equality for some $m$. The polynomial

\[ \begin{aligned} G(t,x) = F(t^q, t^px) &= t^{pn}x^n + \sum_{m=0}^{n-2} a_m(t^q)\,t^{pm}x^m \\ &= t^{pn}x^n + \sum_{m=0}^{n-2} t^{pm} \left( \alpha_{\mu_m} t^{q\mu_m} + \text{terms of higher degree in } t \right) \cdot x^m \\ &= t^{pn} \left( x^n + \sum_{m=0}^{n-2} t^{q\mu_m - p(n-m)} \left( \alpha_{\mu_m} + \text{terms of higher degree in } t \right) \cdot x^m \right) \end{aligned} \]

is divisible by $t^{pn}$ in $\Bbbk[[t]][x]$. Write $H(t,x) = \sum h_m(t) \cdot x^m$ for the quotient $G(t,x)/t^{pn}$. Then the coefficient $h_m$ in $H$ has nonzero constant term for those $m$ for which the equality $q\mu_m = p(n - m)$ is achieved. Hence, the polynomial $H(t,x)$ is reducible, and therefore, there exist $d \in \mathbb{N}$ and $\tau(t) \in \Bbbk((t))$ such that $H(t^d, \tau(t)) = 0$ in $\Bbbk((t))$. Then for $m = qd$ and $\vartheta(t) = t^{pd}\tau(t)$, we have $F(t^m, \vartheta(t)) = F(t^{qd}, t^{pd}\tau(t)) = t^{pnd}H(t^d, \tau(t)) = 0$. □

¹¹ Here we use that $n = 1 + \cdots + 1 \ne 0$ in $\Bbbk$.
¹² Here we use again that $\operatorname{char} \Bbbk = 0$.


Proof (of Theorem 4.3) Given a polynomial $f(x) = a_0(t) + a_1(t)\,x + \cdots + a_n(t)\,x^n$ whose coefficients $a_i(t)$ are Puiseux series, write $m$ for the common denominator of all exponents in all series $a_i$ and put $t = u^m$. Then $a_i(t) = a_i(u^m) \in \Bbbk((u))$, and by Lemma 4.5, the polynomial $f$ acquires a root in the field $\Bbbk((s))$ after the appropriate parameter change $u = s^q$. Returning to the initial parameter $t = s^{qm}$, we get a root of $f$ in the field of Laurent series in $t^{1/qm}$, which is a subfield in the field of Puiseux series. □


Example 4.9 (Counterexample to Theorem 4.3 in Positive Characteristic) The proof of Lemma 4.5 essentially uses the assumption $\operatorname{char} \Bbbk = 0$, without which both Lemma 4.5 and Theorem 4.3 fail. To demonstrate this, put $\Bbbk = \mathbb{F}_p$, and consider the equation $x^p - x = t^{-1}$ over the field $\mathbb{F}_p((t))$. Let us try to equip the power series $x(t) = c_1t^{\lambda_1} + c_2t^{\lambda_2} + \cdots$ with rational $\lambda_1 < \lambda_2 < \cdots$ and nonzero $c_i \in \mathbb{F}_p$ to solve the equation. Since $c^p = c$ for all $c \in \mathbb{F}_p$, the substitution of $x = x(t)$ into $tx^p - tx = 1$ leads to

\[ c_1t^{p\lambda_1 + 1} - c_1t^{\lambda_1 + 1} + c_2t^{p\lambda_2 + 1} - c_2t^{\lambda_2 + 1} + \text{higher powers of } t = 1. \]

The lowest term $c_1t^{p\lambda_1 + 1}$ can be canceled only by the 1 on the right-hand side. Hence, $\lambda_1 = -p^{-1}$ and $c_1 = 1$. The next two terms have to cancel each other. Hence, $\lambda_2 = -p^{-2}$ and $c_2 = c_1$. The next two terms also should kill each other. Therefore, $\lambda_3 = -p^{-3}$ and $c_3 = c_2$, etc. We conclude that

\[ x(t) = t^{-\frac{1}{p}} + t^{-\frac{1}{p^2}} + t^{-\frac{1}{p^3}} + \cdots = \sum_{k > 0} t^{-p^{-k}}. \]

This fractional power series is not a Puiseux series, because its exponents do not have a common denominator.

4.5.2 Newton’s Method


Consider the monic polynomial equation

\[ F(t,x) = x^n + a_{n-1}(t)\,x^{n-1} + \cdots + a_0(t) = 0 \tag{4.40} \]

with coefficients $a_i(t) \in \Bbbk[[t]]$ such that $a_0(0) = 0$, which means that the initial polynomial $F(0,x) \in \Bbbk[x]$ has a zero root. Newton’s method allows us to compute all solutions $x(t)$ of (4.40) such that $x(0) = 0$. If $\xi \in \Bbbk$ is a nonzero root of the initial polynomial $F(0,x)$, then to find all solutions $x(t)$ starting with $x(0) = \xi$ by Newton’s method, one should first shift the variable by the substitution $x \mapsto x + \xi$. An arbitrary polynomial equation with coefficients in the field of Puiseux series can be reduced to (4.40) by the standard manipulations used in the proof of Lemma 4.5.

Essentially, Newton’s method visualizes the main step in the proof of Lemma 4.5. Let us depict by an integer point $(p, q)$ in the coordinate plane each monomial¹³ $x^pt^q$ appearing in $F(t,x)$ with nonzero coefficient. The convex hull of all these points is called the Newton polygon of the polynomial $F(t,x)$. The broken line formed by all the edges of the Newton polygon visible from the origin is called the Newton diagram of $F$. Since $F$ is monic and $a_0(0) = 0$, the Newton diagram does not contain the origin and has its endpoints on the coordinate axes. All integer points $(m, \mu_m)$ of the Newton diagram depict the terms of lowest degree in $t$ of some of the coefficients¹⁴

\[ a_m(t) = \alpha_{\mu_m} t^{\mu_m} + \text{higher powers of } t \]

in $F$. For example, Fig. 4.1 shows the Newton polygon of the polynomial

\[ (-t^3 + t^4) - 2t^2x - tx^2 + 2tx^4 + x^5. \tag{4.41} \]

Its Newton diagram consists of two edges perpendicular to the vectors $(1,1)$ and $(1,3)$.

We are going to compute successive positive rational numbers $\varepsilon_1, \varepsilon_2, \ldots$ and nonzero elements $c_1, c_2, \ldots$ of $\Bbbk$ such that the power series

\[ x(t) = c_1t^{\varepsilon_1} + c_2t^{\varepsilon_1 + \varepsilon_2} + c_3t^{\varepsilon_1 + \varepsilon_2 + \varepsilon_3} + \cdots = t^{\varepsilon_1} \left( c_1 + t^{\varepsilon_2} \left( c_2 + t^{\varepsilon_3} \left( c_3 + \cdots \right) \right) \right) \tag{4.42} \]

extends the zero root of $F(0,x)$ in the field $\Bbbk$ to some root of $F(t,x)$ in the field of Puiseux series.

¹³ Note that the exponent of $x$ grows along the horizontal axis.
¹⁴ But not all, in general.

Fig. 4.1 The Newton polygon of $(-t^3 + t^4) - 2t^2x - tx^2 + 2tx^4 + x^5$


When we substitute the series (4.42) for $x$ in $F(t,x)$, the product of the terms of lowest degree in $t$ of the series $x(t)^m$ and $a_m(t)$ is equal to

\[ \alpha_{\mu_m} c_1^m \, t^{m\varepsilon_1 + \mu_m}. \tag{4.43} \]

Several such terms will cancel each other if they have the same degree in $t$, i.e., the same value of $m\varepsilon_1 + \mu_m$. This happens if and only if the exponents of the monomials (4.43) lie on the same line $p\varepsilon_1 + q = \mathrm{const}$ containing some edge of the Newton diagram. At the first step of Newton’s method, we choose some edge $Z$. Let the integer vector $\nu_Z = (\delta_1, \delta_2)$ be perpendicular to $Z$ and satisfy $\mathrm{GCD}(\delta_1, \delta_2) = 1$. We put $\varepsilon_1 = \delta_1/\delta_2$, the slope of $\nu_Z$. Then we substitute

\[ x(t) = c_1t^{\varepsilon_1} + \text{higher powers of } t \]

for $x$ in $F$ and choose $c_1 \in \Bbbk$ such that all monomials (4.43), which have the lowest degree in $t$, turn out to be canceled in $F(t, x(t))$. These monomials are depicted by the points $(p, q)$ lying on the edge $Z$. Thus, it is reasonable to collect the monomials in $F(t,x)$ along the lines $\delta_1p + \delta_2q = \mathrm{const}$, that is, write $F$ as $F(t,x) = \sum_{i \ge \gamma} f_i(t,x)$, where $\gamma \in \mathbb{N}$ is the value of the linear form $\delta_1p + \delta_2q$ on $Z$, i.e., such that $\varepsilon_1p + q = \gamma/\delta_2$, and

\[ f_i(t,x) = \sum_{\delta_1p + \delta_2q = i} \alpha_{p,q} \, x^pt^q. \tag{4.44} \]

The term of $F(t, x(t))$ of lowest degree in $t$ equals $t^{\gamma/\delta_2} f_\gamma(1, c_1)$. Thus, the minimal power $t^{\gamma/\delta_2}$ in $F(t, x(t))$ disappears if and only if $c_1$ is a root of the polynomial $f_\gamma(1,x) \in \Bbbk[x]$. Different roots lead to different Puiseux series (4.42) starting from $c_1t^{\varepsilon_1}$.

Exercise 4.16 Write $\lambda$ for the abscissa of the left endpoint of $Z$. Verify that $x^{\lambda}$ divides $f_\gamma(1,x)$ in $\Bbbk[x]$.

At the second step of Newton’s method, we substitute $t^{\varepsilon_1}(c_1 + x)$ for $x$ in the polynomial $F(t,x)$. The resulting polynomial $G(t,x) = F(t, t^{\varepsilon_1}(c_1 + x))$ is divisible by $t^{\gamma/\delta_2}$. We write $F_1(t,x)$ for the quotient and repeat the first step with $F_1$ instead of $F$. This leads to the next exponent $\varepsilon_2$, the next coefficient $c_2$, and the next polynomial $F_2$, etc.

For example, let us find the roots of the polynomial (4.41). The normal vector to the left edge of the Newton diagram in Fig. 4.1 has coordinates $(1,1)$ and leads to $\varepsilon_1 = 1$. Lying on this edge are the monomials $-t^3$, $-2t^2x$, $-tx^2$ of $F$. Thus, $\gamma = 3$ and $f_3(1,x) = -1 - 2x - x^2 = -(x+1)^2$ has just one root $c_1 = -1$. Substituting $t(x - 1)$ for $x$ in (4.41) and canceling the factor $t^3$, we get

\[ t - x^2 + t^2(x-1)^4(x+1) = (t + t^2) - 3t^2x + (-1 + 2t^2)\,x^2 + 2t^2x^3 - 3t^2x^4 + t^2x^5 \tag{4.45} \]

as $F_1(t,x)$. Its Newton polygon is shown in Fig. 4.2.

Fig. 4.2 The Newton polygon of $(t + t^2) - 3t^2x + (-1 + 2t^2)\,x^2 + 2t^2x^3 - 3t^2x^4 + t^2x^5$


The Newton diagram consists of just one edge with normal vector $(1,2)$. So $\varepsilon_2 = 1/2$. The lowest term of (4.44) is $f_2(t,x) = t - x^2$. Thus, $c_2$ is a root of the polynomial $f_2(1,x) = 1 - x^2$. There are two possibilities: $c_2 = \pm 1$. Let us take $c_2 = 1$ first. After substituting $x \mapsto t^{1/2}(1 + x)$ and canceling the factor $t$, the polynomial (4.45) becomes

\[ 1 + t - 3t^{3/2}(1+x) + (-1 + 2t^2)(1+x)^2 + 2t^{5/2}(1+x)^3 - 3t^3(1+x)^4 + t^{7/2}(1+x)^5 \]
\[ = \left( t - 3t^{3/2} + \cdots \right) + \left( -2 - 3t^{3/2} + \cdots \right) x + \left( -1 + 2t^2 + \cdots \right) x^2 + \cdots. \]

Here the Newton diagram consists of one edge joining $(0,1)$ and $(1,0)$.

Exercise 4.17 Show that carrying out additional steps of Newton’s method will not change the Newton diagram.

We conclude that the root we obtain in this way is a power series in $t^{1/2}$. If we write it with indeterminate coefficients and substitute in $F$, then we get explicit recurrence formulas allowing us to compute the coefficients one by one.

Exercise 4.18 Check that the root corresponding to the choice $c_2 = -1$ in the previous step is also a power series in $t^{1/2}$.


Returning to the first step, consider the segment in the Newton diagram of Fig. 4.1 with endpoints $(2,1)$ and $(5,0)$. The normal vector of this edge is $(1,3)$. Thus, $\varepsilon_1 = 1/3$. The coefficient $c_1$ is a root of $f_5(1,x)/x^2 = -1 + x^3$. We can choose among the three values $c_1 = 1, \omega, \omega^2$, where $\omega \in \Bbbk$ is a primitive cube root of unity. Let us take $c_1 = \omega$. After the substitution $x \mapsto t^{1/3}(\omega + x)$ in (4.41) and canceling the factor $t^{5/3}$, we get the polynomial

\[ \left( -t^{4/3} + t^{7/3} \right) + \left( 3\omega + 6t^{2/3} \right) x + \left( 9 + 12\omega^2t^{2/3} \right) x^2 + \left( 10\omega^2 + 8\omega t^{2/3} \right) x^3 + \left( 5\omega + 2t^{2/3} \right) x^4 + x^5. \]

Its Newton diagram again consists of a single edge, this time joining $(0, 4/3)$ to $(1,0)$. Thus, the current root is a power series in $t^{1/3}$. Choosing the two remaining values of $c_1$, we arrive at a similar result. Thus, the five roots of the polynomial (4.41) are two power series in $t^{1/2}$, which begin with $-t \pm t^{3/2} + \cdots$, and three power series in $t^{1/3}$ with initial terms $t^{1/3}$, $\omega t^{1/3}$, $\omega^2t^{1/3}$.
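The first step of Newton's method, finding the Newton diagram and the candidate exponents $\varepsilon_1$, is a small convex hull computation. The Python sketch below is an illustration with hypothetical function names. It assumes that the input lists the points $(p, q)$ of all monomials $x^pt^q$ with integer exponents, that the polynomial is monic in $x$, and that $a_0(t) \ne 0$, so that the diagram runs from the vertical axis to the horizontal one.

```python
from fractions import Fraction

def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def newton_diagram(points):
    """Vertices of the Newton diagram, listed from left to right."""
    lowest = {}
    for p, q in points:                     # keep the lowest point over each abscissa
        lowest[p] = min(q, lowest.get(p, q))
    hull = []
    for pt in sorted(lowest.items()):       # lower convex hull by a monotone chain
        while len(hull) >= 2 and cross(hull[-2], hull[-1], pt) <= 0:
            hull.pop()
        hull.append(pt)
    # edges visible from the origin end at the first vertex of minimal height
    q_min = min(q for _, q in hull)
    end = next(i for i, (_, q) in enumerate(hull) if q == q_min)
    return hull[:end + 1]

def edge_exponents(vertices):
    """Exponent epsilon for each edge: the line p*epsilon + q = const contains the edge."""
    return [Fraction(q1 - q2, p2 - p1)
            for (p1, q1), (p2, q2) in zip(vertices, vertices[1:])]

if __name__ == "__main__":
    # monomials of (4.41): -t^3, t^4, -2 t^2 x, -t x^2, 2 t x^4, x^5
    F = [(0, 3), (0, 4), (1, 2), (2, 1), (4, 1), (5, 0)]
    diagram = newton_diagram(F)
    print(diagram, edge_exponents(diagram))
```

For the polynomial (4.41) this gives the vertices $(0,3)$, $(2,1)$, $(5,0)$ and the exponents $1$ and $1/3$ used in the worked example, and for the polynomial (4.45) it gives the single edge from $(0,1)$ to $(2,0)$ with exponent $1/2$.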

Let us make some final remarks on the general case. Write $\ell(Z)$ for the length of the horizontal projection of the edge $Z$. It is clear that the denominator of $\varepsilon_1 = \delta_1/\delta_2$ is not greater than $\ell(Z)$. The degree of the polynomial $f_\gamma(1,x)/x^{\lambda}$, whose root is $c_1$, equals the number of segments into which $Z$ is broken by the integer points. Therefore, it is also not greater than $\ell(Z)$. If $c_1$ is a root of multiplicity $d$, then $f_\gamma(1,x) = (x - c_1)^d g(x)\,x^{\lambda}$, where $g(c_1) \ne 0$, and

\[ f_\gamma(1, c_1 + x) = x^d c_1^{\lambda} g(c_1) + \text{higher powers of } x \]

contains the nonzero term $g(c_1)\,c_1^{\lambda}x^d$, depicted by $(d, 0)$. Therefore, the Newton diagram of the next polynomial $F(t, t^{\varepsilon_1}(c_1 + x))/t^{\gamma/\delta_2}$ meets the horizontal axis at $(d, 0)$ or to the left of it. This means that the length of the horizontal projection of any edge of the diagram is at most $d$, the multiplicity of the root $c_1$. In particular, the next exponent after a simple root $c_1$ has to be an integer.


Proposition 4.5 Every output series $x(t)$ of Newton’s method applied to the
polynomial $F(t, x)$ is a Puiseux series that satisfies the equation¹⁵ $F(t, x(t)) = 0$
in k[¢]. 


Proof Let us show that the exponents of an output series $x(t)$ have a common
denominator. It is enough to verify that all $\varepsilon_i$ but a finite number are integers. As we
have just seen, the denominator of $\varepsilon_{i+1}$ is at most $\ell(\zeta_{i+1})$, which is not greater than
the multiplicity of the root $c_i$ chosen in the previous step. In turn, this multiplicity is
at most $\ell(\zeta_i)$. We obtain the inequality $\ell(\zeta_{i+1}) \le \ell(\zeta_i)$, which will be an equality
only if $c_i$ is a root of multiplicity $\ell(\zeta_i)$. Since the degree of $f_{\zeta_i}(1,x)/x^\lambda$ is at most
$\ell(\zeta_i)$, such a multiplicity of $c_i$ is possible only for


$f_{\zeta_i}(1,x)/x^\lambda = \alpha \cdot (x - c_i)^{\ell(\zeta_i)}$, $\alpha \in k$, $\alpha \ne 0$.


In this case, the horizontal projections of integer points lying on $\zeta_i$ cover all integers
in the range from $\lambda$ to $\lambda + \ell(\zeta_i)$. This forces the normal vector of $\zeta$ to be of the
form $n_\zeta = (\delta, 1)$ for $\delta \in \mathbb{N}$. Hence, $\varepsilon_{i+1} = \delta \in \mathbb{N}$ in this case. We conclude that
either $\ell(\zeta_{i+1}) < \ell(\zeta_i)$ or $\varepsilon_{i+1}$ is an integer. Therefore, all the exponents $\varepsilon_i$ after
some finite number of iterations will be integers.

Now the equality $F(t, x(t)) = 0$ can be verified easily. After the $i$th step of
Newton’s algorithm, we are sure that the degree of the lowest term in the series
$F(t, x(t))$ is at least the sum of all the fractions $\gamma/\delta_2$ appearing in the first $i$ steps.
Since we always have $\gamma \ge 1$ and $\delta_2 = 1$ after a finite number of steps, this sum is
unbounded from above. □


¹⁵ Note that this gives another proof of Lemma 4.5.


Problems for Independent Solution to Chap. 4 


Problem 4.1 Compute the antiderivative and the 1000th derivative of the function
$x^4/(1 + x^2)$.

Problem 4.2 Write the nth coefficients of the following power series as explicit 
functions of n: 


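(a) $(2x^2 - 3x + 1)^{-1}$, (b) $(x^4 + 2x^3 - 7x^2 - 20x - 12)^{-1}$, (c) $\sqrt[3]{1 + 2x}$,
(d) $1/\sqrt{1 - 3x}$, (e) $\cosh(x) \stackrel{\text{def}}{=} (e^x + e^{-x})/2$, (f) $\sinh(x) \stackrel{\text{def}}{=} (e^x - e^{-x})/2$,
(g) $\cos(x) \stackrel{\text{def}}{=} (e^{ix} + e^{-ix})/2$, (h) $\sin(x) \stackrel{\text{def}}{=} (e^{ix} - e^{-ix})/2i$.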


Problem 4.3 Express $a_k$ as an explicit function of $k$ for the following sequences:


(a) $a_0 = 1$, $a_1 = -1$, $a_k = 2a_{k-1} - a_{k-2}$ for all $k \ge 2$,

(b) $a_0 = 1$, $a_1 = q$, $a_k = (1 + q)a_{k-1} - q a_{k-2}$ for all $k \ge 2$, where $q \in \mathbb{C}$,

(c) $a_0 = 0$, $a_1 = 1$, $a_2 = 5$, $a_3 = 14$, $a_k = 4a_{k-1} - 6a_{k-2} + 4a_{k-3} - a_{k-4}$ for all
$k \ge 4$.


Problem 4.4 Let $g(x) = \prod_i (x - a_i)$, where all $a_i$ are distinct. For a polynomial
$f \in k[x]$ such that $\deg f < \deg g$, prove the following partial fraction expansion¹⁶:


$f(x)/g(x) = \sum_i \frac{f(a_i)/g'(a_i)}{x - a_i}$, where $g' = \frac{d}{dx} g$.


¹⁶ See Proposition 4.1 on p. 76.

Problem 4.5 (Taylor Expansion) Let $k$ be a field of zero characteristic. Given a
point $a \in k$ and $n + 1$ values $b_0, b_1, \ldots, b_n \in k$, construct a polynomial $f \in k[x]$
such that $\deg f \le n$, $f(a) = b_0$, $(d/dx)^i f(a) = b_i$ for all $i = 1, \ldots, n$, and prove
that such a polynomial is unique.

Problem 4.6 Show that all coefficients of the power series $\tan(x) \stackrel{\text{def}}{=} \sin(x)/\cos(x)$
are positive.

Problem 4.7 Show that $e^x \in \mathbb{Q}[[x]] \smallsetminus \mathbb{Q}(x)$.

Problem 4.8 Find an $f \in \mathbb{Q}[[x]] \smallsetminus \mathbb{Q}(x)$ whose nonzero coefficients all are equal to 1.


Problem 4.9 Write $p_m(n)$ for the total number of Young diagrams consisting of $n$
cells and at most $m$ rows. Also put $p_m(0) \stackrel{\text{def}}{=} 1$. Express $p_m(n)$ in terms of $p_{m-1}(n)$
and $p_m(n - m)$. Show that the generating series $P_m(x) = \sum_{n \ge 0} p_m(n) x^n \in \mathbb{Q}[[x]]$
is a rational function.

Problem 4.10 (Euler’s Pentagonal Theorem) Write $p(n)$ for the number of Young
diagrams of weight¹⁷ $n$ and $p_{\text{even}}(n)$ (respectively $p_{\text{odd}}(n)$) for the number of
weight-$n$ Young diagrams consisting of an even (respectively odd) number of
rows of distinct lengths. Put $p(0) \stackrel{\text{def}}{=} 1$ and consider the generating series
$P(x) = \sum_{n \ge 0} p(n) x^n \in \mathbb{Q}[[x]]$. Show that

(a) $P(x) = \prod_{k \ge 1} (1 - x^k)^{-1}$, (b) $1/P(x) = 1 + \sum_{n \ge 1} \bigl(p_{\text{even}}(n) - p_{\text{odd}}(n)\bigr) x^n$,

(c) $p(n) = \sum_{k \ge 1} (-1)^{k+1} \bigl( p(n - (3k^2 - k)/2) + p(n - (3k^2 + k)/2) \bigr)$
$= p(n - 1) + p(n - 2) - p(n - 5) - p(n - 7) + p(n - 12) + p(n - 15) - \cdots$,
and evaluate $p(10)$.

Problem 4.11 A triangulation of a convex n-gon by some of its diagonals is 
admissible if the diagonals do not intersect each other anywhere except at the 
vertices of the n-gon. For example, the 4-gon, 5-gon, and 6-gon admit 2, 5,
and 14 such triangulations. How many admissible triangulations are there for an
arbitrary n? 

Problem 4.12 Write $i_m$ for the total number of monic irreducible polynomials of
degree¹⁸ $m$ in $\mathbb{F}_p[x]$. Prove that $(1 - px)^{-1} = \prod_{m \in \mathbb{N}} (1 - x^m)^{-i_m}$ in $\mathbb{Q}[[x]]$.

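Problem 4.13 Express the differentiation operator $\frac{d}{dt} : \mathbb{Q}[t] \to \mathbb{Q}[t]$, $f \mapsto f'$, as a
power series in the difference operator $\nabla : \mathbb{Q}[t] \to \mathbb{Q}[t]$, $f(t) \mapsto f(t) - f(t - 1)$,
i.e., find a power series $\Psi \in \mathbb{Q}[[x]]$ without a constant term such that $\Psi(\nabla) = \frac{d}{dt}$.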

Problem 4.14 Verify that the polynomials


$\gamma_0 \equiv 1$ and $\gamma_k(t) \stackrel{\text{def}}{=} \binom{t+k}{k} = \frac{1}{k!}(t + 1)(t + 2) \cdots (t + k)$ for $k > 0$ (4.46)


satisfy the relations $\nabla^n \gamma_k = \gamma_{k-n}$ and use this to prove that every polynomial
$f \in \mathbb{Q}[t]$ can be linearly expressed through polynomials (4.46) by the formula
$f = \sum_k \nabla^k f(-1) \cdot \gamma_k$. Also show that the coefficients $c_k \in \mathbb{Q}$ of every linear
expression $f = \sum_k c_k \cdot \gamma_k$ have to be $c_k = \nabla^k f(-1)$.


¹⁷ That is, consisting of n cells; the number $p(n)$ is also called the nth partition number; see
Example 1.3 on p. 6.

¹⁸ Compare with Problem 3.12 on p. 67.

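Problem 4.15 Write $T_\alpha \stackrel{\text{def}}{=} e^{-\alpha D} : f(x) \mapsto f(x - \alpha)$ for the shift-by-$\alpha$ operator¹⁹ and
put $T = T_1$. Prove that the following properties of the linear map²⁰ $F : \mathbb{Q}[t] \to \mathbb{Q}[t]$
are equivalent:


(1) $F \circ \nabla = \nabla \circ F$, (2) $F \circ T = T \circ F$, (3) $\forall \alpha \in \mathbb{Q}$, $F \circ T_\alpha = T_\alpha \circ F$,
(4) $\exists \Phi \in \mathbb{Q}[[x]] : F = \Phi(D)$, (5) $\exists \Psi \in \mathbb{Q}[[x]] : F = \Psi(\nabla)$.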


Problem 4.16 Prove that the Bernoulli numbers $B_{2k}$ are positive for odd $k \ge 1$ and
negative for even $k \ge 2$.


Problem 4.17 Write the first three nonzero terms of the Puiseux expansion over C 
for all roots x(t) of the following polynomials: 


(a) + )—Px+tx —2, 

(b) P+ (-t4+ 2?)x4+%, 

(ce) t—Pxt+ 372 x3 — 3tx? +27, 

@) (2 +46 + 64) — 44x 4+ (2-42 — 2°) 2 + x4, 
(e) 2P —P x4 2P x —tx +29. 


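¹⁹ Compare with Example 4.7 on p. 89.

²⁰ The map $F : \mathbb{Q}[t] \to \mathbb{Q}[t]$ is linear if for all $\alpha, \beta \in \mathbb{Q}$ and all $f, g \in \mathbb{Q}[t]$, the equality
$F(\alpha \cdot f + \beta \cdot g) = \alpha F(f) + \beta F(g)$ holds. For example, all the maps $D = \frac{d}{dt}$, $D^k$, $\Phi(D)$ for all
$\Phi \in \mathbb{Q}[[x]]$, $\nabla$, $\nabla^k$, and $\Phi(\nabla)$ are linear.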


Chapter 5 
Ideals, Quotient Rings, and Factorization 


In this section we continue to use the notation of Chaps. 3 and 4 and write K for an 
arbitrary commutative ring with unit and k for an arbitrary field. 


5.1 Ideals 


5.1.1 Definition and Examples 


A subring $I$ in a commutative ring $K$ is called an ideal if for every $a \in I$ and $f \in K$,
we have $fa \in I$. We have seen in Sect. 2.6.4 that the kernel of a ring homomorphism
$\varphi : K \to L$ is an ideal in $K$. For every $a \in K$, all multiples of $a$ form an ideal


$(a) \stackrel{\text{def}}{=} \{fa \mid f \in K\}$ (5.1)


called the principal ideal generated by $a$. We used principal ideals in the
constructions of the residue rings $\mathbb{Z}/(n)$ and $k[x]/(f)$, where the principal ideals $(n) \subset \mathbb{Z}$
and $(f) \subset k[x]$ were the kernels of the quotient homomorphisms $\mathbb{Z} \to \mathbb{Z}/(n)$,
$m \mapsto [m]_n$, and $k[x] \to k[x]/(f)$, $g \mapsto [g]_f$. Every commutative ring $K$ with unit
has the trivial ideals $(0) = \{0\}$ and $(1) = K$.


Exercise 5.1 Prove that for every commutative ring $K$ with unit, the following
properties of an ideal $I \subset K$ are equivalent: (a) $I$ contains an invertible element
of $K$, (b) $1 \in I$, (c) $I = K$.


Proposition 5.1 A commutative ring K with unit is a field if and only if there are 
no nontrivial ideals in K. 


Proof An ideal in a field is trivial by Exercise 5.1. Conversely, if every nonzero ideal
coincides with $K$, then $(b) = K$ for every $b \ne 0$. Hence, $1 = sb$ for some $s \in K$.
Thus, all $b \ne 0$ are invertible. □

5.1.2 Noetherian Rings


Every subset $M \subset K$ generates an ideal $(M) \subset K$ formed by all finite sums
$b_1 a_1 + b_2 a_2 + \cdots + b_m a_m$, where $a_1, a_2, \ldots, a_m \in M$, $b_1, b_2, \ldots, b_m \in K$, $m \in \mathbb{N}$.


Exercise 5.2 Verify that $(M) \subset K$ is an ideal.


Every ideal $I \subset K$ is generated by some subset $M \subset K$, e.g., by $M = I$. An ideal
$I \subset K$ is said to be finitely generated if it admits a finite set of generators, that is, if
it can be written as


$I = (a_1, a_2, \ldots, a_k) = \{b_1 a_1 + b_2 a_2 + \cdots + b_k a_k \mid b_i \in K\}$


for some $a_1, a_2, \ldots, a_k \in I$. We met finitely generated ideals when we constructed
the GCD in the rings $\mathbb{Z}$ and $k[x]$.


Lemma 5.1 The following properties of a commutative ring $K$ are equivalent:


(1) Every subset $M \subset K$ contains some finite collection of elements $a_1, a_2, \ldots, a_k \in M$
such that $(M) = (a_1, a_2, \ldots, a_k)$.

(2) Every ideal $I \subset K$ is finitely generated.

(3) For every infinite chain of increasing ideals $I_1 \subseteq I_2 \subseteq I_3 \subseteq \cdots$ in $K$, there
exists $n \in \mathbb{N}$ such that $I_\nu = I_n$ for all $\nu \ge n$.


Proof Clearly, (1) ⇒ (2). To deduce (3) from (2), write $I = \bigcup I_\nu$ for the union of
all ideals in the chain. Then $I$ is an ideal as well. By (2), $I$ is generated by some finite
set of its elements. All these elements belong to some $I_n$. Therefore, $I_n = I = I_\nu$
for all $\nu \ge n$. To deduce (1) from (3), we construct inductively a chain of strictly
increasing ideals $I_n = (a_1, a_2, \ldots, a_n)$ starting from an arbitrary $a_1 \in M$. While
$I_k \ne (M)$, we choose any element $a_{k+1} \in M \smallsetminus I_k$ and put $I_{k+1} = (a_{k+1} \cup I_k)$. Since
$I_k \subsetneq I_{k+1}$ in each step, by (3) this procedure has to stop after a finite number of
steps. At that moment, we obtain $I_m = (a_1, a_2, \ldots, a_m) = (M)$. □

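The inductive procedure from the last step of the proof can be sketched for the ring $\mathbb{Z}$. This is only an illustration under the assumption $K = \mathbb{Z}$, where every ideal $(a_1, \ldots, a_k)$ equals $(g)$ with $g$ the greatest common divisor of the generators, so that membership in the current ideal is a divisibility test. The function name `finite_generators` is hypothetical.

```python
from math import gcd


def finite_generators(M):
    """Choose generators of (M) in Z one at a time, as in (3) => (1).

    The current ideal I_k = (a_1, ..., a_k) is stored through its
    nonnegative generator g; g == 0 encodes the zero ideal.
    """
    chosen, g = [], 0
    for a in M:
        in_ideal = (a == 0) if g == 0 else (a % g == 0)
        if not in_ideal:
            # a lies outside I_k, so I_{k+1} = (a, I_k) is strictly larger.
            chosen.append(a)
            g = gcd(g, a)
    return chosen, g
```

For `M = [12, 18, 24, 30, 7, 49]`, the chain is $(12) \subsetneq (12, 18) = (6) \subsetneq (12, 18, 7) = (1)$, so the procedure stops after three choices. A single pass over `M` suffices here, because an element already lying in $I_k$ also lies in every later ideal of the chain.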

Definition 5.1 A commutative ring K is said to be Noetherian if it satisfies the 
conditions from Lemma 5.1. Note that every field is Noetherian. 


Theorem 5.1 (Hilbert’s Basis Theorem) For every Noetherian commutative ring
$K$, the polynomial ring $K[x]$ is Noetherian as well.


Proof Consider an arbitrary ideal $I \subset K[x]$ and write $L_d \subset K$ for the set of leading
coefficients of all polynomials of degree $\le d$ in $I$ including the zero polynomial.
Also write $L_\infty = \bigcup_d L_d$ for the set of all leading coefficients of all polynomials in $I$.


Exercise 5.3 Verify that all of the $L_d$ and $L_\infty$ are ideals in $K$.


Since $K$ is Noetherian, the ideals $L_d$ and $L_\infty$ are finitely generated. For all $d$
(including $d = \infty$), write $f_1^{(d)}, f_2^{(d)}, \ldots, f_{m_d}^{(d)} \in K[x]$ for those polynomials in $I$ whose
leading coefficients span the ideal $L_d \subset K$. Let $D = \max_i \deg f_i^{(\infty)}$. We claim that the
polynomials $f_i^{(\infty)}$ and $f_j^{(d)}$ for $d < D$ generate $I$. Let us show first that each polynomial
$g \in I$ is congruent modulo $f_1^{(\infty)}, f_2^{(\infty)}, \ldots, f_{m_\infty}^{(\infty)}$ to some polynomial of degree less
than $D$. Since the leading coefficient of $g$ lies in $L_\infty$, it can be written as $\sum \lambda_i a_i$,
where $\lambda_i \in K$ and $a_i$ is the leading coefficient of $f_i^{(\infty)}$. As long as $\deg g \ge D$, all
differences $m_i = \deg g - \deg f_i^{(\infty)}$ are nonnegative, and we can form the polynomial
$h = g - \sum \lambda_i \cdot f_i^{(\infty)}(x) \cdot x^{m_i}$, which is congruent to $g$ modulo $I$ and has $\deg h < \deg g$.
We replace $g$ by $h$ and repeat the procedure while $\deg h \ge D$. When we come to a
polynomial $h \equiv g \pmod{I}$ such that $\deg h < D$, the leading coefficient of $h$ falls
into some $L_d$ with $d < D$, and we can cancel the leading terms of $h$ by subtracting
appropriate combinations of polynomials $f_j^{(d)}$ for $0 \le d < D$ until we get $h = 0$. □


Corollary 5.1 For every Noetherian commutative ring $K$, the ring $K[x_1, x_2, \ldots, x_n]$
is Noetherian. □


Exercise 5.4 Show that the ring $K[[x_1, x_2, \ldots, x_n]]$ is Noetherian for every Noetherian
commutative ring $K$.


Corollary 5.2 Every infinite system of polynomial equations with coefficients in a
Noetherian ring $K$ is equivalent to some finite subsystem.


Proof Since $K[x_1, x_2, \ldots, x_n]$ is Noetherian, among the left-hand sides of a
polynomial equation system


$f_\nu(x_1, x_2, \ldots, x_n) = 0$


there is some finite collection $f_1, f_2, \ldots, f_m$ that generates the same ideal as all the
$f_\nu$. This means that every $f_\nu$ is equal to $g_1 f_1 + g_2 f_2 + \cdots + g_m f_m$ for some
$g_i \in K[x_1, x_2, \ldots, x_n]$. Hence, every equation $f_\nu = 0$ follows from
$f_1 = f_2 = \cdots = f_m = 0$. □


Example 5.1 (Non-Noetherian Rings) Consider a countably infinite set of variables
$x_i$ numbered by $i \in \mathbb{N}$ and define the polynomial ring $\mathbb{Q}[x_1, x_2, x_3, \ldots]$ in these
variables to be the set of all finite sums of finite monomials $x_{\nu_1}^{m_1} x_{\nu_2}^{m_2} \cdots x_{\nu_s}^{m_s}$ multiplied
by arbitrary rational coefficients. The ideal $I$ spanned by the set of all variables
consists of all polynomials without a constant term. It coincides with the union of
strictly increasing ideals $(x_1) \subset (x_1, x_2) \subset (x_1, x_2, x_3) \subset (x_1, x_2, x_3, x_4) \subset \cdots$,
forming an infinite tower.


Exercise 5.5 Verify that $(x_1, x_2, \ldots, x_n) \subsetneq (x_1, x_2, \ldots, x_{n+1})$.


Thus, the ideal $I$ is not finitely generated, and the ring $\mathbb{Q}[x_1, x_2, x_3, \ldots]$ is not
Noetherian. The less artificial rings $C(\mathbb{R})$ and $C^\infty(\mathbb{R})$ of all continuous and all
infinitely differentiable functions $f : \mathbb{R} \to \mathbb{R}$ are also not Noetherian.


Exercise 5.6 Write $I_k \subset C^\infty(\mathbb{R})$ for the set of all functions vanishing on $\mathbb{Z} \smallsetminus [-k, k]$.
Verify that $I_0 \subsetneq I_1 \subsetneq I_2 \subsetneq \cdots$ form an infinite chain of strictly increasing ideals in
$C^\infty(\mathbb{R})$.

Caution 5.1 A subring of a Noetherian ring is not necessarily Noetherian. For
example, the ring of formal power series $\mathbb{C}[[z]]$ is Noetherian by Exercise 5.4,
whereas the subring formed by all analytic functions¹ $\mathbb{C} \to \mathbb{C}$ is not.


Exercise 5.7 Verify this by arguing as in Exercise 5.6. 


5.2 Quotient Rings 


5.2.1 Factorization Homomorphism 


Let a commutative ring K be equipped with an equivalence relation ~ that 
decomposes K into equivalence classes. Write X for the set of these classes and 
consider the quotient map sending an element a € K to its equivalence class [a] € X: 


$\pi : K \twoheadrightarrow X$, $a \mapsto [a]$. (5.2)


Let us ascertain whether $X$ admits the structure of a commutative ring such that the
map (5.2) becomes a homomorphism of commutative rings. This means that the
operations


$[a] + [b] = [a + b]$, $[a] \cdot [b] = [ab]$ (5.3)


provide $X$ with a well-defined commutative ring structure. If so, then the zero
element of $X$ is the class of zero $[0]$. Therefore, $[0] = \ker \pi \subset K$ should be an
ideal in $K$, and by Proposition 2.1 on p. 32, every fiber $[a] = \pi^{-1}(\pi(a))$ of the
homomorphism (5.2) is the parallel shift of $[0]$ by $a$:


$\forall a \in K$, $[a] = a + [0] = \{a + b \mid b \in [0]\}$.


In fact, these necessary conditions are also sufficient: for every ideal $I \subset K$, the
congruence modulo $I$ relation $a_1 \equiv a_2 \pmod{I}$, meaning that $a_1 - a_2 \in I$, is an
equivalence on $K$. It decomposes $K$ into a disjoint union of residues² modulo $I$,


$[a]_I \stackrel{\text{def}}{=} a + I = \{a + b \mid b \in I\}$, (5.4)


and the operations (5.3) provide the set of residues $X$ with the well-defined structure
of a commutative ring with zero element $[0]_I = I$ and unit element $[1]_I$ (if there is a
unit $1 \in K$).


Exercise 5.8 Check that congruence modulo / is an equivalence relation and verify 
that the operations (5.3) are well defined on the equivalence classes (5.4). 


¹ That is, all power series converging everywhere in $\mathbb{C}$.
² Also called cosets of the ideal $I$.

Definition 5.2 For a commutative ring $K$ and ideal $I \subset K$, the ring of residues (5.4)
equipped with the operations (5.3) is called the quotient ring (or factor ring) of $K$
modulo $I$ and is denoted by $K/I$. The surjective homomorphism of rings


$K \twoheadrightarrow K/I$, $a \mapsto [a]_I$, (5.5)


is called the quotient homomorphism. 


For example, the residue rings $\mathbb{Z}/(n)$ and $k[x]/(f)$ are the quotient rings of $\mathbb{Z}$ and
$k[x]$ modulo the principal ideals $(n) \subset \mathbb{Z}$ and $(f) \subset k[x]$.

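For the quotient ring $\mathbb{Z}/(n)$, the independence of the operations (5.3) from the choice of representatives can be checked by brute force. The sketch below is only an illustration under the assumption $K = \mathbb{Z}$, $I = (n)$; the function names are hypothetical, and a finite check of this kind does not replace the general argument.

```python
def cls(a, n):
    """Canonical representative in 0..n-1 of the residue class [a] modulo n."""
    return a % n


def add(a, b, n):
    """[a] + [b] = [a + b] in Z/(n)."""
    return cls(a + b, n)


def mul(a, b, n):
    """[a] * [b] = [ab] in Z/(n)."""
    return cls(a * b, n)


def operations_well_defined(n):
    """Compare results for representatives a, a + s*n and b, b + t*n."""
    for a in range(n):
        for b in range(n):
            for s in range(-2, 3):
                for t in range(-2, 3):
                    a2, b2 = a + s * n, b + t * n
                    if add(a, b, n) != add(a2, b2, n):
                        return False
                    if mul(a, b, n) != mul(a2, b2, n):
                        return False
    return True
```

For example, in $\mathbb{Z}/(12)$ one gets $[7] + [8] = [3]$ and $[3] \cdot [4] = [0]$, whichever representatives of the classes are used.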

Example 5.2 (Image of a Ring Homomorphism) It follows from Sect. 2.6.4 on p. 33
that the image of a ring homomorphism $\varphi : K_1 \to K_2$ is isomorphic to the quotient
ring $K_1/\ker(\varphi)$. The isomorphism sends an element $b = \varphi(a) \in \operatorname{im} \varphi$ to the coset
$[a]_{\ker \varphi} = \varphi^{-1}(b)$. Therefore, every homomorphism of rings $\varphi : K_1 \to K_2$ can
be decomposed as the quotient epimorphism $K_1 \twoheadrightarrow K_1/\ker \varphi$ followed by the
monomorphism $K_1/\ker \varphi \simeq \operatorname{im} \varphi \hookrightarrow K_2$.


Exercise 5.9 Show that every quotient ring of a Noetherian ring is Noetherian. 


5.2.2 Maximal Ideals and Evaluation Maps 


An ideal $\mathfrak{m} \subset K$ is said to be maximal if the quotient ring $K/\mathfrak{m}$ is a field. This is
equivalent to saying that $\mathfrak{m}$ is maximal among the proper ideals³ partially ordered by
inclusion. Indeed, the nontriviality axiom $0 \ne 1$ holds in $K/\mathfrak{m}$ if and only if $1 \notin \mathfrak{m}$.
The class $[a] \ne [0]$ is invertible in $K/\mathfrak{m}$ if and only if there exist $b \in K$, $x \in \mathfrak{m}$ such
that $ab = 1 + x$ in $K$, meaning that the ideal spanned by $\mathfrak{m}$ and $a \in K \smallsetminus \mathfrak{m}$ contains
1 and coincides with $K$. Thus, the invertibility of every nonzero class in $K/\mathfrak{m}$ means
that $\mathfrak{m}$ cannot be enlarged within the class of proper ideals by adding an element from
$K \smallsetminus \mathfrak{m}$.

For a proper ideal $J \subset K$, write $\mathcal{I}(J)$ for the set of all proper ideals $I \supseteq J$. It is
partially ordered by inclusion and complete,⁴ because every chain of ideals $I_\nu \in \mathcal{I}(J)$
has an upper bound $I = \bigcup I_\nu$.


Exercise 5.10 Check that $I$ is a proper ideal of $K$.


By Zorn’s lemma, Lemma 1.3, $\mathcal{I}(J)$ contains a maximal element, and the above
arguments show that every such element is a maximal ideal in $K$. Therefore, every
proper ideal is contained in some maximal ideal. Note that every field contains just
one maximal ideal, the zero ideal (0).


³ That is, among ideals different from $K$ itself.
⁴ See Definition 1.2 on p. 16.

Maximal ideals appear in rings of functions as the kernels of evaluation maps.
Given a set $X$ and a field $k$, write $K = k^X$ for the ring of all functions $f : X \to k$.
Then associated with an arbitrary point $p \in X$ is the evaluation map sending the
function $f : X \to k$ to its value at $p$:


$\operatorname{ev}_p : K \to k$, $f \mapsto f(p)$.


It is obviously surjective. Hence, in accordance with Example 5.2, $K/\ker \operatorname{ev}_p \simeq
\operatorname{im} \operatorname{ev}_p = k$ is a field. Therefore, $\ker \operatorname{ev}_p = \{f \in K \mid f(p) = 0\}$ is a maximal ideal
in $K$.


Exercise 5.11 Show that every maximal ideal in $\mathbb{C}[x]$ coincides with $\ker \operatorname{ev}_p$ for
some $p \in \mathbb{C}$ and determine a maximal ideal $\mathfrak{m} \subset \mathbb{R}[x]$ different from all the ideals
$\ker \operatorname{ev}_p$ for $p \in \mathbb{R}$.


Exercise 5.12 Show that the maximal ideals in the ring of continuous functions
$[0, 1] \to \mathbb{R}$ are exhausted by $\ker \operatorname{ev}_p$ for $p \in [0, 1]$.


5.2.3. Prime Ideals and Ring Homomorphisms to Fields 


An ideal $\mathfrak{p} \subset K$ is said to be prime if the quotient ring $K/\mathfrak{p}$ has no zero divisors. In
other words, $\mathfrak{p} \subset K$ is prime if for all $a, b \in K$, the inclusion $ab \in \mathfrak{p}$ implies that
$a \in \mathfrak{p}$ or $b \in \mathfrak{p}$. For example, the principal ideals $(p) \subset \mathbb{Z}$ and $(q) \subset k[x]$ are prime
if and only if $p \in \mathbb{Z}$ is a prime number and $q \in k[x]$ is an irreducible polynomial.


Exercise 5.13 Prove the last two assertions. 


It follows from the definitions that every maximal ideal is prime. The converse is
not true in general. For example, the principal ideal $(x) \subset \mathbb{Q}[x, y]$ is prime but not
maximal, because $\mathbb{Q}[x, y]/(x) \simeq \mathbb{Q}[y]$ is an integral domain but not a field. The
prime ideals of a ring $K$ are the kernels of nonzero homomorphisms


$\varphi : K \to k$,


where $k$ is an arbitrary field. Indeed, the image $\operatorname{im} \varphi \simeq K/\ker \varphi$ of any such
homomorphism is an integral domain, because the ambient field $k$ is. Conversely, if
$K/\mathfrak{p}$ is an integral domain, then it admits a canonical embedding


$\iota : K/\mathfrak{p} \hookrightarrow Q_{K/\mathfrak{p}}$


into its field of fractions. The quotient map $\pi : K \twoheadrightarrow K/\mathfrak{p}$ followed by this
embedding is a homomorphism of rings $\iota\pi : K \to Q_{K/\mathfrak{p}}$ with kernel $\mathfrak{p}$.

Exercise 5.14 Assume that a prime ideal $\mathfrak{p}$ contains the intersection of ideals
$I_1, I_2, \ldots, I_m$. Prove that at least one of the ideals $I_k$ is completely contained in $\mathfrak{p}$.


5.2.4 Finitely Generated Commutative Algebras 


Given a commutative ring $K$ with unit, a quotient ring of the form

$A = K[x_1, x_2, \ldots, x_n]/I$,


where $I \subset K[x_1, x_2, \ldots, x_n]$ is an arbitrary ideal, is called a finitely generated
K-algebra.⁵ The residue classes $a_i = x_i \pmod{I}$ are called generators of $A$.
Polynomials $f \in I$ are called relations between the generators. Informally, a K-algebra
$A$ consists of all polynomial expressions produced by means of commuting letters
$a_1, a_2, \ldots, a_n$ and the elements of $K$ using addition, subtraction, and multiplication
in the presence of polynomial relations $f(a_1, a_2, \ldots, a_n) = 0$ for all $f \in I$.


Corollary 5.3 Every finitely generated commutative algebra A over a Noetherian 
ring K is Noetherian, and all polynomial relations between generators of A follow 
from some finite set of those relations. 


Proof The result follows immediately from Exercise 5.9 and Corollary 5.1. □


5.3 Principal Ideal Domains 


5.3.1 Euclidean Domains 


Definition 5.3 (Principal Ideal Domain) An integral domain⁶ $K$ is called a
principal ideal domain if every ideal in $K$ is a principal ideal, that is, if it is generated
by a single element.


For example, $\mathbb{Z}$ and $k[x]$ are principal ideal domains. We essentially proved this
when we constructed the GCD in these rings.⁷ The key point in the proofs was a
division with remainder argument, which can be formalized as follows.


Definition 5.4 (Euclidean Domain) An integral domain $K$ is called Euclidean if $K$
is equipped with a degree function⁸ $\nu : K \smallsetminus \{0\} \to \mathbb{Z}_{\ge 0}$ such that for all $a, b \in K \smallsetminus 0$,
the inequality $\nu(ab) \ge \nu(a)$ holds and

$\exists\, q, r \in K$ : $a = bq + r$ and either $\nu(r) < \nu(b)$ or $r = 0$. (5.6)


The elements $q$, $r$ in (5.6) are called the quotient and remainder of the division of $a$
by $b$. Note that the definition does not require the uniqueness of $q$, $r$ for given $a$, $b$.


⁵ Or more solemnly, a finitely generated commutative algebra over $K$.
⁶ That is, a commutative ring with unit and without zero divisors (see Sect. 2.4.2 on p. 28).
⁷ See Sect. 2.2.2 on p. 24 and Proposition 3.4 on p. 47.
⁸ Also called a Euclidean valuation.

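Condition (5.6) can be made concrete for the Gaussian integers with $\nu(z) = |z|^2$. The sketch below is one possible way to produce a quotient and remainder, not necessarily the one intended in the exercises: it assumes that a Gaussian integer $a + bi$ is stored as the pair `(a, b)` and rounds the exact quotient in $\mathbb{C}$ to a nearest lattice point. The function names are hypothetical.

```python
def norm(z):
    """Degree function nu(z) = |z|^2 of the Gaussian integer z = (a, b) = a + bi."""
    a, b = z
    return a * a + b * b


def divmod_gauss(a, b):
    """Return (q, r) with a = b*q + r and norm(r) < norm(b), for b != 0."""
    (a1, a2), (b1, b2) = a, b
    n = norm(b)
    # a / b = a * conj(b) / n; x and y are the numerators of its two coordinates.
    x = a1 * b1 + a2 * b2
    y = a2 * b1 - a1 * b2
    # Round x/n and y/n to a nearest integer using integer arithmetic only.
    q1 = (2 * x + n) // (2 * n)
    q2 = (2 * y + n) // (2 * n)
    # r = a - b*q, where b*q = (b1*q1 - b2*q2) + (b1*q2 + b2*q1) i.
    r1 = a1 - (b1 * q1 - b2 * q2)
    r2 = a2 - (b1 * q2 + b2 * q1)
    return (q1, q2), (r1, r2)
```

Since each coordinate of $a/b - q$ has absolute value at most $1/2$, the remainder satisfies $\nu(r) \le \nu(b)/2 < \nu(b)$. Rounding to a nearest lattice point is not always unique, which illustrates the remark that $q$, $r$ need not be unique.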

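Exercise 5.15 Verify that the following rings with degree functions are Euclidean:
(a) $\mathbb{Z}$, $\nu(z) = |z|$, (b) $k[x]$, $\nu(f) = \deg f$,
(c) $\mathbb{Z}[i] \stackrel{\text{def}}{=} \{a + bi \in \mathbb{C} \mid a, b \in \mathbb{Z},\ i^2 = -1\}$, $\nu(z) = |z|^2$,
(d) $\mathbb{Z}[\omega] \stackrel{\text{def}}{=} \{a + b\omega \in \mathbb{C} \mid a, b \in \mathbb{Z},\ \omega^2 + \omega + 1 = 0\}$, $\nu(z) = |z|^2$.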


Theorem 5.2 Every Euclidean domain K is a principal ideal domain. 


Proof Given a nonzero ideal $I \subset K$, write $d \in I$ for some nonzero element of lowest
degree. Clearly, $(d) \subset I$. Each $a \in I$ can be written as $a = dq + r$, where either $r = 0$
or $\nu(r) < \nu(d)$. The latter is impossible by the choice of $d$, because $r = a - dq \in I$.
Hence, $r = 0$, and therefore $I \subset (d)$. □


Corollary 5.4 The rings $\mathbb{Z}$, $k[x]$, $\mathbb{Z}[i]$, $\mathbb{Z}[\omega]$ from Exercise 5.15 are principal ideal
domains.


Caution 5.2 There are non-Euclidean principal ideal domains. Among the simplest
examples are the ring $\mathbb{R}[x, y]/(x^2 + y^2 + 1)$ and the ring of algebraic numbers of the
form $(x + y\sqrt{-19})/2$, where $x, y \in \mathbb{Z}$ are either both even or both odd. However,
a thoroughgoing discussion of such examples requires advanced techniques from 
number theory and geometry that would take us outside the scope of this book. 


Exercise 5.16 Let $K$ be a Euclidean domain. Prove that a nonzero element $b \in K$
is invertible if and only if $\nu(ab) = \nu(a)$ for all nonzero $a \in K$.


5.3.2. Greatest Common Divisor 


Let $K$ be an arbitrary principal ideal domain. For a finite set of elements
$a_1, a_2, \ldots, a_n \in K$, there exists an element $d \in K$ such that $d$ divides each $a_i$
and is divisible by every common divisor of all the $a_i$. It is called a greatest
common divisor and denoted by $\mathrm{GCD}(a_1, a_2, \ldots, a_n)$. By definition, $d$ is a greatest
common divisor of $a_1, a_2, \ldots, a_n \in K$ if and only if $(d) = (a_1, a_2, \ldots, a_n) =
\{x_1 a_1 + x_2 a_2 + \cdots + x_n a_n \mid x_i \in K\}$. The following exercise shows that any two
greatest common divisors differ by an invertible factor.

Exercise 5.17 (Associated Elements) Prove that the following properties of
nonzero elements $a$, $b$ of an integral domain $K$ are equivalent: (a) $(a) = (b)$,
(b) $a \mid b$ and $b \mid a$, (c) $a = sb$ for some invertible $s \in K$.


Elements $a$, $b$ satisfying the conditions of Exercise 5.17 are called associates.
For example, integers $a, b \in \mathbb{Z}$ are associates if and only if $a = \pm b$. Thus, the
notation $\mathrm{GCD}(a_1, a_2, \ldots, a_n)$ means a class of mutually associated elements. Note
that every greatest common divisor $d$, as an element of the ideal $(a_1, a_2, \ldots, a_n)$,
admits a representation $d = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$, where $x_i \in K$.

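For $K = \mathbb{Z}$, a representation $d = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$ can be computed by the extended Euclidean algorithm. The following sketch assumes $K = \mathbb{Z}$ and picks the nonnegative representative of the class of associated greatest common divisors; the function names `bezout` and `bezout_many` are hypothetical.

```python
def bezout(a, b):
    """Return (d, x, y) with d = gcd(a, b) >= 0 and d = x*a + y*b."""
    # Invariant: a = x0*A + y0*B and b = x1*A + y1*B for the original inputs A, B.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if a < 0:
        a, x0, y0 = -a, -x0, -y0
    return a, x0, y0


def bezout_many(nums):
    """Return (d, coeffs) with d = gcd(nums) and d = sum(c * a for c, a in zip(coeffs, nums))."""
    d, coeffs = 0, []
    for a in nums:
        d, x, y = bezout(d, a)
        # New d = x * (old d) + y * a, so the old coefficients are scaled by x.
        coeffs = [x * c for c in coeffs] + [y]
    return d, coeffs
```

For instance, `bezout_many([12, 18, 8])` returns the generator 2 of the ideal $(12, 18, 8)$ together with integer coefficients expressing 2 through 12, 18, and 8. The coefficients are not unique.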

5.3.3 Coprime Elements 


It follows from the previous section that the following conditions on a collection of
elements $a_1, a_2, \ldots, a_n$ in a principal ideal domain $K$ are equivalent:


• Every common divisor of $a_1, a_2, \ldots, a_n$ is invertible.
• $x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = 1$ for some $x_i \in K$.
• $(a_1, a_2, \ldots, a_n) = K$.


Elements $a_1, a_2, \ldots, a_n$ satisfying these conditions are called coprime.⁹


5.3.4 Irreducible Elements 


Recall¹⁰ that a noninvertible element $q \in K$ is irreducible if the factorization
$q = ab$ is possible only if $a$ or $b$ is invertible. Equivalently, $q \in K$ is irreducible
if and only if the ideal $(q)$ is maximal among the proper principal ideals partially
ordered by inclusion. In a principal ideal domain, two irreducible elements $p$, $q$ are
either associates or coprime. Indeed, $(p, q) = (d)$ for some $d$, and there are two
possibilities: either $d$ is invertible or $d$ is associated with both $p$ and $q$. In the first
case, $(p, q) \ni 1$, and therefore $p$, $q$ are coprime.


Caution 5.3 In an integral domain $K$ that is not a principal ideal domain, two
unassociated irreducible elements are not necessarily coprime. For example, the
polynomials $x, y \in \mathbb{Q}[x, y]$ are irreducible, unassociated, and not coprime.


Exercise 5.18 Check that the ideals $(x, y) \subset \mathbb{Q}[x, y]$ and $(2, x) \subset \mathbb{Z}[x]$ are not
principal.


⁹ See Sect. 2.7 on p. 34.
¹⁰ See Definition 3.3 on p. 48.

Proposition 5.2 The following properties of an element $p$ in a principal ideal
domain $K$ are equivalent:


(1) The quotient ring $K/(p)$ is a field.
(2) The quotient ring $K/(p)$ has no zero divisors.
(3) The element $p$ is irreducible.


Proof The implication (1) ⇒ (2) holds trivially for every commutative ring¹¹ $K$.
The implication (2) ⇒ (3) holds for every integral domain¹² $K$. Indeed, if $p = ab$
in $K$, then $[a][b] = 0$ in $K/(p)$. Since $K/(p)$ has no zero divisors, one of the two
factors, say $[a]$, equals $[0]$. Therefore, $a = ps = abs$ for some $s \in K$. This leads
to $a(1 - bs) = 0$ and $bs = 1$. Hence, $b$ is invertible. It remains to establish the
implication (3) ⇒ (1) for every principal ideal domain $K$. Since all ideals in $K$
are principal, maximality of $(p)$ among proper principal ideals means maximality
among all proper ideals, which in turn means that $K/(p)$ is a field, as we have seen
in Sect. 5.2.2 on p. 107. □

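For the principal ideal domain $\mathbb{Z}$ and an integer $p \ge 2$, the three properties of Proposition 5.2 can be compared by direct enumeration of the residues modulo $p$. The sketch below is only a finite sanity check under the assumption $K = \mathbb{Z}$, with hypothetical function names.

```python
def is_field(n):
    """Every nonzero class of Z/(n) has a multiplicative inverse."""
    return all(any(a * b % n == 1 for b in range(1, n)) for a in range(1, n))


def has_zero_divisors(n):
    """Some product of two nonzero classes of Z/(n) is zero."""
    return any(a * b % n == 0 for a in range(1, n) for b in range(1, n))


def is_irreducible(p):
    """p >= 2 admits no factorization into two noninvertible positive integers."""
    return p > 1 and all(p % d != 0 for d in range(2, p))
```

For every $n$ in a finite range, the values `is_field(n)`, `not has_zero_divisors(n)`, and `is_irreducible(n)` agree, as the proposition predicts. For example, all three are true for $n = 7$ and false for $n = 12$.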

5.4 Unique Factorization Domains 


Throughout this section, we denote by K an arbitrary integral domain, that is, a 
commutative ring with unit and without zero divisors. 


5.4.1 Irreducible Factorization 


Proposition 5.3 If K is Noetherian, then every noninvertible element a € K is a 
finite product of irreducible elements. 


Proof If $a$ is irreducible, there is nothing to do. If not, we write $a$ as a product of
two noninvertible elements and repeat the process on each factor. If this procedure
stops after a finite number of iterations, we get the required irreducible factorization.
If not, we can form an infinite sequence of elements $a = a_0, a_1, a_2, \ldots$ in which
$a_{i+1}$ divides $a_i$ but is not an associate of $a_i$. This means that the principal ideals
$(a_0) \subsetneq (a_1) \subsetneq (a_2) \subsetneq \cdots$ form an infinite strictly increasing tower, which cannot
exist in a Noetherian ring. □

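The splitting procedure from the proof can be sketched for $K = \mathbb{Z}$, where the noninvertible nonzero elements are the integers $a$ with $|a| \ge 2$. This is an illustration under that assumption, with hypothetical function names; trial division is used only because it is the simplest way to find a splitting.

```python
from math import isqrt


def split(a):
    """Return (b, c) with a = b*c and b, c noninvertible, or None if a is irreducible."""
    for b in range(2, isqrt(abs(a)) + 1):
        if a % b == 0:
            return b, a // b
    return None


def irreducible_factors(a):
    """Factor an integer a with |a| >= 2 by repeated splitting, as in the proof."""
    if abs(a) < 2:
        raise ValueError("a must be nonzero and noninvertible")
    parts = split(a)
    if parts is None:
        return [a]
    b, c = parts
    return irreducible_factors(b) + irreducible_factors(c)
```

The recursion terminates because both factors of a splitting have strictly smaller absolute value. For $a = -12$ it returns `[2, 2, -3]`; the factors are determined only up to sign and order, in accordance with the fact that irreducible factors are defined up to invertible multipliers.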

Definition 5.5 (Unique Factorization Domain) An integral domain $K$ is called a
unique factorization domain if every noninvertible element of $K$ is a finite product
of irreducible elements and every equality $p_1 \cdot p_2 \cdots p_m = q_1 \cdot q_2 \cdots q_n$ between
such products implies that $m = n$ and the factors can be renumbered in such a way
that $q_\nu = p_\nu s_\nu$ for some invertible $s_\nu \in K$ for all $\nu$. A factorization $a = q_1 \cdot q_2 \cdots q_m$
in which all $q_\nu$ are irreducible is called an irreducible factorization of $a$. Its factors
$q_i$ are called irreducible factors of $a$. Note that such factors are defined only up to
multiplication by an invertible element of $K$.


¹¹ See Sect. 2.4.2 on p. 28.
¹² Not necessarily a principal ideal domain.


Example 5.3 (Integers) The ring Z is a unique factorization domain by Exer- 
cise 2.8. In Z, each class of associated irreducible elements contains a canonical 
representative, the positive prime p. Choosing these canonical irreducible elements, 
we can provide each integer $n$ with the canonical irreducible factorization $n =
\pm p_1 \cdot p_2 \cdots p_m$, where $p_1, p_2, \ldots, p_m$ are nondecreasing positive prime numbers.
Such a canonical factorization is uniquely determined by n, with no ambiguity 
whatsoever. 


Example 5.4 (Polynomials with Coefficients in a Field) For a field k, the polyno- 
mial ring k[x] is a unique factorization domain by Exercise 3.7. In k[x], each class of 
associated irreducible elements also contains a canonical representative, the monic 
irreducible polynomial. In choosing these canonical irreducibles, we can provide a 
polynomial $f$ with an irreducible factorization $f = c\, p_1 \cdot p_2 \cdots p_m$, where $c \in k$ and
$p_1, p_2, \ldots, p_m$ are monic irreducible polynomials. Such a factorization is determined
by $f$ up to a permutation of the $p_i$.


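Example 5.5 (Non-unique Factorization Domain) The ring $\mathbb{Z}[\sqrt{5}] \stackrel{\text{def}}{=} \mathbb{Z}[x]/(x^2 - 5)$
consists of elements $a + b\sqrt{5}$, where $a, b \in \mathbb{Z}$ and $\sqrt{5} \stackrel{\text{def}}{=} [x]$ satisfies $\sqrt{5} \cdot \sqrt{5} = 5$.
It is isomorphic to the smallest subring of $\mathbb{R}$ containing $\mathbb{Z}$ and the number $\sqrt{5} \in \mathbb{R}$.
Let us verify that the following two factorizations of the number 4 within $\mathbb{Z}[\sqrt{5}]$,


$2 \cdot 2 = 4 = (\sqrt{5} + 1) \cdot (\sqrt{5} - 1)$ (5.7)


are distinct irreducible factorizations. In analogy with the complex numbers, let us
call the elements $\vartheta = a + b\sqrt{5}$ and $\bar{\vartheta} = a - b\sqrt{5}$ conjugates and introduce the
norm $\|\vartheta\| \stackrel{\text{def}}{=} \vartheta \cdot \bar{\vartheta} = a^2 - 5b^2 \in \mathbb{Z}$.


Exercise 5.19 Verify that conjugation $\vartheta \mapsto \bar{\vartheta}$ is a ring automorphism of $\mathbb{Z}[\sqrt{5}]$.


The above exercise implies that the norm is multiplicative: $\|\vartheta_1 \vartheta_2\| = \vartheta_1 \vartheta_2 \bar{\vartheta}_1 \bar{\vartheta}_2 =
\|\vartheta_1\| \cdot \|\vartheta_2\|$. In particular, $\vartheta \in \mathbb{Z}[\sqrt{5}]$ is invertible if and only if $\|\vartheta\| = \pm 1$, and in
this case, $\vartheta^{-1} = \pm\bar{\vartheta}$. Since $\|2\| = 4$ and $\|\sqrt{5} \pm 1\| = -4$, factorization of these
elements as a product $xy$ of noninvertible $x$, $y$ forces $\|x\| = \|y\| = \pm 2$. However, in
$\mathbb{Z}[\sqrt{5}]$ there are no elements of norm $\pm 2$, because the equation $a^2 - 5b^2 = \pm 2$ has
no integer solutions. Indeed, the residues of both sides modulo 5 satisfy $a^2 = \pm 2$
in $\mathbb{F}_5$, but neither $+2$ nor $-2$ is a square in $\mathbb{F}_5$. We conclude that 2 and $\sqrt{5} \pm 1$ are
irreducible in $\mathbb{Z}[\sqrt{5}]$. Since neither $\sqrt{5} + 1$ nor $\sqrt{5} - 1$ is divisible by 2 in $\mathbb{Z}[\sqrt{5}]$,
the two irreducible factorizations (5.7) are actually different.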
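
The arithmetic used in Example 5.5 can be checked with a short sketch. It assumes that an element $a + b\sqrt{5}$ is stored as the pair `(a, b)`; the function names are hypothetical. The set of squares modulo 5 is the fact on which the norm argument rests.

```python
def mul(x, y):
    """Product in Z[sqrt 5] of x = a + b*sqrt(5) and y = c + d*sqrt(5), stored as pairs."""
    (a, b), (c, d) = x, y
    return (a * c + 5 * b * d, a * d + b * c)


def norm(x):
    """Norm a^2 - 5 b^2 of x = a + b*sqrt(5)."""
    a, b = x
    return a * a - 5 * b * b


# Squares in F_5: neither 2 nor 3 = -2 occurs, so a^2 - 5 b^2 = +-2 is impossible.
squares_mod_5 = {a * a % 5 for a in range(5)}
```

Both `mul((2, 0), (2, 0))` and `mul((1, 1), (-1, 1))` give `(4, 0)`, which are the two factorizations (5.7), and the norms of the factors are $4$ and $-4$ respectively.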
5.4.2 Prime Elements


An element p € K is called prime if the principal ideal (p) C K is prime, that is, if 
the quotient ring K/(p) has no zero divisors. In other words, p is prime if and only if 
the condition p | ab implies that p | a or p | b. Every prime element p is irreducible. 
Indeed, if p = xy, then one of the factors, say x, is divisible by p. Hence, p = pyz 
for some z. This forces yz = 1 and means that y is invertible. 

We know from Proposition 5.2 on p. 111 that in a principal ideal domain, the
converse is also true: every irreducible element is prime. However, in an integral
domain that has nonprincipal ideals, an irreducible element may fail to be prime. For
example, in $\mathbb{Z}[\sqrt{5}] = \mathbb{Z}[x]/(x^2 - 5)$, the element 2 is irreducible but not prime,
because the quotient ring


$\mathbb{Z}[\sqrt{5}]/(2) \simeq \mathbb{Z}[x]/(2, x^2 - 5) = \mathbb{Z}[x]/(2, x^2 + 1) \simeq \mathbb{F}_2[x]/(x^2 + 1)
\simeq \mathbb{F}_2[x]/\bigl((x + 1)^2\bigr)$


has the zero divisor $(x + 1) \pmod{(2, x^2 + 1)}$. In particular, this means that the
irreducible number 2 does not divide $1 + \sqrt{5}$ in $\mathbb{Z}[\sqrt{5}]$ but divides $(1 + \sqrt{5})^2 =
6 + 2\sqrt{5}$.

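The failure of primality of 2 in $\mathbb{Z}[\sqrt{5}]$ can be checked directly. The sketch assumes that $a + b\sqrt{5}$ is stored as the pair `(a, b)`; since $2(c + d\sqrt{5}) = 2c + 2d\sqrt{5}$, divisibility by 2 means that both coordinates are even. The function names are hypothetical.

```python
def mul(x, y):
    """Product in Z[sqrt 5] of x = a + b*sqrt(5) and y = c + d*sqrt(5), stored as pairs."""
    (a, b), (c, d) = x, y
    return (a * c + 5 * b * d, a * d + b * c)


def divisible_by_2(x):
    """x = a + b*sqrt(5) lies in the ideal (2) exactly when a and b are both even."""
    a, b = x
    return a % 2 == 0 and b % 2 == 0


theta = (1, 1)              # the element 1 + sqrt(5)
square = mul(theta, theta)  # its square
```

Here `square` equals `(6, 2)`, that is, $6 + 2\sqrt{5}$, which is divisible by 2, whereas `theta` itself is not.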

Proposition 5.4 A Noetherian integral domain K is a unique factorization domain 
if and only if every irreducible element in K is prime. 


Proof Let $K$ be a unique factorization domain and $q \in K$ irreducible. If $q$ divides
some product $ab$, then an irreducible factorization of $ab$ contains a factor associated
with $q$. On the other hand, an irreducible factorization of $ab$ is the product of
irreducible factorizations of $a$ and $b$. Thanks to unique factorization, $q$ is associated
with some irreducible factor in $a$ or in $b$. Therefore, $q$ divides $a$ or $b$.

Now suppose all irreducible elements of $K$ are prime. We know from
Proposition 5.3 that every element of a Noetherian ring admits an irreducible factorization.
Let us prove that in every integral domain, an equality between two products of
prime elements


$p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_m$ (5.8)


implies that $k = m$ and (after appropriate renumbering) $q_i = s_i p_i$ for some invertible
$s_i$ for each $i$. Since $p_1$ divides the right-hand side of (5.8), one of the factors there,
say $q_1$, is divisible by $p_1$. Therefore, $q_1 = s_1 p_1$, where $s_1$ is invertible, because
$q_1$ is irreducible. Since $K$ has no zero divisors, we can cancel the factor $p_1$ on
both sides of (5.8) and repeat the argument for the shorter equality $p_2 p_3 \cdots p_k =
(s_1 q_2) q_3 \cdots q_m$. □

Corollary 5.5 Every principal ideal domain is a unique factorization domain.

Proof This follows at once from Proposition 5.2 on p. 111. □


Example 5.6 (Sums of Two Squares, Continuation of Sect. 3.5.5) It follows from
the previous corollary and Corollary 5.4 that the ring of Gaussian integers $\mathbb{Z}[i]$ is a
unique factorization domain. Let us ascertain whether a given prime $p \in \mathbb{N}$ remains
irreducible in $\mathbb{Z}[i]$. Since $\bar{p} = p$ and $p$ has no real irreducible divisors that are
not associates of $p$, all the factors in an irreducible factorization of $p$ in $\mathbb{Z}[i]$ split
into pairs of complex conjugate divisors. Therefore, a prime $p \in \mathbb{Z}$ that becomes
reducible in $\mathbb{Z}[i]$ can be written as $p = (a + ib)(a - ib) = a^2 + b^2$ with nonzero
$a, b \in \mathbb{Z}$. In other words, a prime $p \in \mathbb{N}$ is reducible in $\mathbb{Z}[i]$ if and only if $p$ is a
sum of two perfect squares. At the same time, a prime $p \in \mathbb{Z}$ is irreducible in $\mathbb{Z}[i]$
if and only if the quotient ring $\mathbb{Z}[i]/(p) \simeq \mathbb{Z}[x]/(p, x^2 + 1) \simeq \mathbb{F}_p[x]/(x^2 + 1)$ is
a field.¹³ This means that the quadratic binomial $x^2 + 1$ is irreducible in $\mathbb{F}_p[x]$, that
is, has no roots in $\mathbb{F}_p$. We conclude that a prime $p \in \mathbb{Z}$ is a sum of two squares if
and only if $-1$ is a quadratic residue modulo $p$. We have seen in Sect. 3.6.3 on
p. 65 that for odd $p$ the latter is the case if and only if $(p - 1)/2$ is even. Thus the primes
that are sums of two squares are exactly the primes $p = 4k + 1$ and $p = 2 = 1^2 + 1^2$.
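The three equivalent conditions of this example can be compared by brute force for small primes. The sketch below is only an illustration under our own naming (`two_squares`, `minus_one_is_square`); it searches exhaustively and proves nothing beyond the tested range.

```python
from math import isqrt

def is_prime(n):
    """Trial division, adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def two_squares(p):
    """Return (a, b) with a*a + b*b == p and 0 < a <= b, or None if there is none."""
    for a in range(1, isqrt(p) + 1):
        b2 = p - a * a
        b = isqrt(b2)
        if b >= a and b * b == b2:
            return (a, b)
    return None

def minus_one_is_square(p):
    """True if the equation x*x = -1 has a solution in F_p."""
    return any((x * x + 1) % p == 0 for x in range(p))

primes = [p for p in range(2, 200) if is_prime(p)]
for p in primes:
    reducible = two_squares(p) is not None        # p = a^2 + b^2, so p splits in Z[i]
    assert reducible == minus_one_is_square(p) == (p == 2 or p % 4 == 1)
```

For instance, `two_squares(13)` finds $13 = 2^2 + 3^2$, whereas `two_squares(7)` returns `None`, in agreement with $7 = 4 \cdot 1 + 3$.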


Exercise 5.20 Use Sect. 3.5.5 on p. 62 and Example 5.6 to prove that a number
$n \in \mathbb{N}$ is a sum of two perfect squares if and only if every prime factor of the form
$p = 4k + 3$ appears in the prime factorization of $n$ an even number of times.


5.4.3 GCD in Unique Factorization Domains


A finite collection of elements $a_1, a_2, \ldots, a_m$ in a unique factorization domain $K$ has
a greatest common divisor,¹⁴ which can be described as follows. Choose $q \in K$ in
each class of associated irreducible elements and write $m_q$ for the maximal power of
$q$ that divides all the elements $a_i$. Then up to multiplication by invertible elements,

$$\mathrm{GCD}(a_1, a_2, \ldots, a_m) = \prod_q q^{m_q}. \tag{5.9}$$

Since each $a_i$ is divisible by only a finite number of $q$'s, all but a finite number of
exponents $m_q$ vanish. Therefore, the product on the right-hand side of (5.9) is finite.
By construction, it divides each $a_i$. Since $K$ is a unique factorization domain, every
common divisor of all the elements $a_i$ must divide the right-hand side of (5.9).
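For $K = \mathbb{Z}$, formula (5.9) can be compared with the Euclidean algorithm. The following sketch assumes positive integers and naive trial-division factorization; the function names are ours and serve only as an illustration.

```python
from math import gcd
from functools import reduce

def factorize(n):
    """Prime factorization of a positive integer as a dict {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def gcd_by_formula(numbers):
    """GCD as the product of q**m_q, where m_q is the largest power of q dividing all numbers."""
    factorizations = [factorize(a) for a in numbers]
    common_primes = set.intersection(*(set(f) for f in factorizations))
    result = 1
    for q in common_primes:
        m_q = min(f[q] for f in factorizations)   # exponent of q in (5.9)
        result *= q ** m_q
    return result

sample = [360, 84, 150]
print(gcd_by_formula(sample), reduce(gcd, sample))
```

Both computations give 6, since only the primes 2 and 3 divide all three numbers, each with minimal exponent 1.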


¹³ See Proposition 5.2 on p. 111.
¹⁴ See Remark 2.3 on p. 27.


5.4.4 Polynomials over Unique Factorization Domains


Let $K$ be a unique factorization domain with field of fractions¹⁵ $Q_K$. The polynomial
ring $K[x]$ is a subring of the polynomial ring $Q_K[x]$. Given a polynomial

$$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \in K[x], \tag{5.10}$$

we write $\operatorname{cont}(f) \overset{\text{def}}{=} \mathrm{GCD}(a_0, a_1, \ldots, a_n) \in K$ for the greatest common divisor of the
coefficients and call it the content of the polynomial $f$. The content of a polynomial
is defined up to multiplication by invertible elements.


Lemma 5.2 For all $f, g \in K[x]$, the equality $\operatorname{cont}(fg) = \operatorname{cont}(f) \cdot \operatorname{cont}(g)$ holds.


Proof Since $K$ is a unique factorization domain, it is enough to check that for every
irreducible $q \in K$, the following statement holds: $q$ divides $\operatorname{cont}(fg)$ if and only
if $q$ divides $\operatorname{cont}(f) \cdot \operatorname{cont}(g)$. Since every irreducible element is prime in a unique
factorization domain, the quotient ring $R = K/(q)$ has no zero divisors. For the
polynomial (5.10), write $[f]_q(x) = [a_0]_q x^n + [a_1]_q x^{n-1} + \cdots + [a_{n-1}]_q x + [a_n]_q \in R[x]$
for the polynomial whose coefficients are the residues of the coefficients of $f$ modulo
$q$. Then the map $K[x] \to R[x]$, $f \mapsto [f]_q$, is a ring homomorphism. Since $R[x]$ is an
integral domain, the product $[fg]_q = [f]_q [g]_q$ equals zero if and only if one of the
factors $[f]_q$, $[g]_q$ equals zero. In other words, $q$ divides all coefficients of $fg$ if and
only if $q$ divides all coefficients of $g$ or all coefficients of $f$. □
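Lemma 5.2 is easy to test numerically for $K = \mathbb{Z}$. In the sketch below, polynomials are stored as coefficient lists with the constant term first; this representation and the helper names are our assumptions, not notation from the text.

```python
from math import gcd
from functools import reduce

def content(f):
    """GCD of the coefficients of a nonzero integer polynomial."""
    return reduce(gcd, f)

def poly_mul(f, g):
    """Product of two integer polynomials given as coefficient lists, lowest degree first."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [6, 0, 10, 4]      # 4x^3 + 10x^2 + 6, content 2
g = [9, -3, 15]        # 15x^2 - 3x + 9, content 3
fg = poly_mul(f, g)

print(content(f), content(g), content(fg))
```

The product has content $6 = 2 \cdot 3$, as the lemma predicts. Since `math.gcd` returns nonnegative values, the content is normalized up to the invertible elements $\pm 1$ of $\mathbb{Z}$.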


¹⁵ See Sect. 4.1.2 on p. 75.


Lemma 5.3 (Simplified Form of a Polynomial) Every polynomial $f(x) \in Q_K[x]$
can be written as

$$f(x) = \frac{a}{b} \cdot f_{\mathrm{red}}(x), \tag{5.11}$$

where $f_{\mathrm{red}} \in K[x]$, $a, b \in K$, and $\operatorname{cont}(f_{\mathrm{red}}) = \mathrm{GCD}(a, b) = 1$. The elements $a$, $b$ and
the polynomial $f_{\mathrm{red}}$ are determined by $f$ uniquely up to multiplication by invertible
elements of $K$.


Proof Factor the lowest common denominator from the coefficients of $f$. Then
factor out the greatest common divisor from the numerators. We get a number
$c \in Q_K$ multiplied by a polynomial of content 1 with coefficients in $K$. Denote
this polynomial by $f_{\mathrm{red}} \in K[x]$ and write $c$ as a simplified fraction $a/b$. This is the
representation (5.11). Given an equality of two expressions of the form (5.11) in $Q_K[x]$,

$$\frac{a}{b} \cdot f_{\mathrm{red}}(x) = \frac{c}{d} \cdot g_{\mathrm{red}}(x),$$

we have $ad \cdot f_{\mathrm{red}}(x) = bc \cdot g_{\mathrm{red}}(x)$ in $K[x]$. A comparison of the contents of both sides
leads to the equality $ad = bc$ up to an invertible factor. Since $\mathrm{GCD}(a, b) = \mathrm{GCD}(c, d) = 1$, the elements
$a$, $c$ should be associates, as should also $b$, $d$. This forces $f_{\mathrm{red}}(x) = g_{\mathrm{red}}(x)$ up to
multiplication by an invertible constant from $K$. □
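The two steps of this proof translate directly into a computation for $K = \mathbb{Z}$, $Q_K = \mathbb{Q}$. The sketch below is an illustration under our own conventions: coefficients are listed with the constant term first, the function name `simplified_form` is hypothetical, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import gcd, lcm
from functools import reduce

def simplified_form(coeffs):
    """Split a nonzero polynomial over Q into (a, b, f_red) with f = (a/b) * f_red.

    The result has f_red in Z[x] of content 1, b > 0, and gcd(a, b) = 1.
    """
    coeffs = [Fraction(c) for c in coeffs]
    denominator = reduce(lcm, (c.denominator for c in coeffs))   # lowest common denominator
    numerators = [int(c * denominator) for c in coeffs]          # integer numerators
    numerator = reduce(gcd, numerators)                          # GCD of the numerators
    f_red = [n // numerator for n in numerators]                 # polynomial of content 1
    c = Fraction(numerator, denominator)                         # Fraction reduces automatically
    return c.numerator, c.denominator, f_red

# f(x) = 3/2 + (9/4) x - (3/10) x^2
a, b, f_red = simplified_form([Fraction(3, 2), Fraction(9, 4), Fraction(-3, 10)])
print(a, b, f_red)
```

Here the lowest common denominator is 20 and the numerators 30, 45, $-6$ have greatest common divisor 3, so $f = \frac{3}{20}\,(10 + 15x - 2x^2)$.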


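Lemma 5.4 (Gauss's Lemma) Every irreducible polynomial $f \in K[x]$ remains
irreducible in $Q_K[x]$.


Proof Since $f$ is irreducible, $\operatorname{cont}(f) = 1$. Let $f(x) = g(x) \cdot h(x)$ in $Q_K[x]$. Write $g$
and $h$ in the simplified form (5.11) and simplify the resulting constant factor. Then
we get the equality

$$f(x) = \frac{a}{b} \cdot g_{\mathrm{red}}(x) \cdot h_{\mathrm{red}}(x), \tag{5.12}$$

where $g_{\mathrm{red}}, h_{\mathrm{red}} \in K[x]$ have content 1, and $a, b \in K$ have $\mathrm{GCD}(a, b) = 1$. By
Lemma 5.2,

$$\operatorname{cont}(g_{\mathrm{red}} h_{\mathrm{red}}) = \operatorname{cont}(g_{\mathrm{red}}) \cdot \operatorname{cont}(h_{\mathrm{red}}) = 1.$$

Therefore, the right-hand side of (5.12) is the simplified expression (5.11) for $f$.
By Lemma 5.3, both elements $a$, $b$ are invertible in $K$, and $f = g_{\mathrm{red}} h_{\mathrm{red}}$ up to
multiplication by invertible elements of $K$. Since $f$ is irreducible in $K[x]$, one of the
factors $g_{\mathrm{red}}$, $h_{\mathrm{red}}$ is an invertible constant, and hence one of $g$, $h$ is a nonzero
constant in $Q_K$. □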


Theorem 5.3 If $K$ is a unique factorization domain, then the polynomial ring $K[x]$
is a unique factorization domain as well.


Proof Since the principal ideal domain $Q_K[x]$ is a unique factorization domain,
every $f \in K[x]$ is factorized within $Q_K[x]$ into a finite product of irreducible
polynomials $f_\nu \in Q_K[x]$. If we write each $f_\nu$ in simplified form (5.11) and reduce
the resulting constant factor, we obtain the equality

$$\operatorname{cont}(f) \cdot f_{\mathrm{red}} = \frac{a}{b} \prod_\nu f_{\nu,\mathrm{red}}, \tag{5.13}$$

where all the $f_{\nu,\mathrm{red}} \in K[x]$ are irreducible of content 1, and $a, b \in K$ have
$\mathrm{GCD}(a, b) = 1$. Both sides of (5.13) have simplified form (5.11), because we have
$\operatorname{cont}\big(\prod_\nu f_{\nu,\mathrm{red}}\big) = 1$. Therefore, $b = 1$ and $f = a \prod_\nu f_{\nu,\mathrm{red}}$ up to multiplication by
invertible elements of $K$. To get an irreducible factorization of $f$ in $K[x]$, it remains
to factorize $a \in K$ within $K$. Let us verify that the irreducible factorization in $K[x]$ is
unique. Consider the equality $a_1 a_2 \cdots a_k \cdot p_1 p_2 \cdots p_s = b_1 b_2 \cdots b_m \cdot q_1 q_2 \cdots q_r$, where
$a_\alpha, b_\beta \in K$ are irreducible constants and $p_\mu, q_\nu \in K[x]$ are irreducible polynomials.
Since irreducible polynomials have content 1, a comparison of the contents of both
sides leads to the equality $a_1 a_2 \cdots a_k = b_1 b_2 \cdots b_m$ in $K$. Since $K$ is a unique
factorization domain, $k = m$, and (after appropriate renumbering) $a_i = s_i b_i$ for some
invertible $s_i \in K$. Hence, $p_1 p_2 \cdots p_s = q_1 q_2 \cdots q_r$ in $K[x]$ up to multiplication by
invertible elements of $K$. Since $p_i$, $q_i$ are irreducible in $Q_K[x]$ by Gauss's lemma and $Q_K[x]$ is a unique
factorization domain, we conclude that $r = s$ and (after appropriate renumbering)
$p_i = q_i$ up to a constant factor. By Lemma 5.3, these factors have to be invertible
elements of $K$. □


Corollary 5.6 If $K$ is a unique factorization domain,¹⁶ then the polynomial ring
$K[x_1, x_2, \ldots, x_n]$ is also a unique factorization domain. □


5.5 Factorization of Polynomials with Rational Coefficients 


In this section, we discuss how an irreducible factorization of a given polynomial
$f \in \mathbb{Q}[x]$ can be obtained in practice. It is quite reasonable to begin with a determination
of the rational roots of $f$. This can be done using a finite number of tests.


Exercise 5.21 Show that the simplified fraction $\alpha = p/q \in \mathbb{Q}$ can be a root of the
polynomial $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \in \mathbb{Z}[x]$ only if $p \mid a_n$ and $q \mid a_0$.
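The finite number of tests mentioned above can be organized as follows. This is a sketch that takes the statement of Exercise 5.21 for granted; the sample polynomial $2x^3 - 3x^2 - 3x + 2$ and the function names are our own choices, and the code assumes $a_0 \neq 0$ and $a_n \neq 0$.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Horner evaluation; coeffs run from the leading coefficient a_0 down to a_n."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def rational_roots(coeffs):
    """All rational roots of a_0 x^n + ... + a_n with integer a_i, a_0 != 0, a_n != 0."""
    a_0, a_n = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * p, q)
                  for p in divisors(a_n) for q in divisors(a_0) for sign in (1, -1)}
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)

roots = rational_roots([2, -3, -3, 2])   # 2x^3 - 3x^2 - 3x + 2
print(roots)
```

Only the candidates $\pm 1$, $\pm 2$, $\pm \tfrac12$ have to be tested here, and the roots found are $-1$, $\tfrac12$, and $2$.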


An explicit knowledge of the complex roots is also extremely helpful.
Exercise 5.22 Factorize $x^4 + 4$ as a product of two quadratic trinomials in $\mathbb{Z}[x]$.


After these simple considerations have been exhausted, one can invoke some
deeper divisibility criteria. To apply them, we write $f$ as $c \cdot g$, where $c \in \mathbb{Q}$ and
$g \in \mathbb{Z}[x]$ has content 1. Then by Gauss's lemma,¹⁷ the factorization of $f$ in $\mathbb{Q}[x]$ is
equivalent to the factorization of $g$ in $\mathbb{Z}[x]$. Let us analyze the factorization in $\mathbb{Z}[x]$.


5.5.1 Reduction of Coefficients 


For $m \in \mathbb{Z}$, there is a homomorphism of rings

$$\mathbb{Z}[x] \to (\mathbb{Z}/(m))[x], \quad f \mapsto [f]_m, \tag{5.14}$$

which sends $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$ to the polynomial

$$[a_n]_m x^n + [a_{n-1}]_m x^{n-1} + \cdots + [a_1]_m x + [a_0]_m$$

with coefficients in the residue class ring $\mathbb{Z}/(m)$. The homomorphism (5.14) is
called a reduction of the coefficients¹⁸ modulo $m$. Since the equality $f = gh$ in
$\mathbb{Z}[x]$ implies the equality $[f]_m = [g]_m \cdot [h]_m$ in the ring $(\mathbb{Z}/(m))[x]$ for all $m$, the
irreducibility of $[f]_m$ for some $m$ forces $f$ to be irreducible in $\mathbb{Z}[x]$, provided that $f$
has content 1 and the leading coefficient of $f$ is not divisible by $m$, so that the
reduction preserves the degrees of the factors. For prime $m = p$,
the residue class ring $\mathbb{Z}/(m)$ becomes the field $\mathbb{F}_p$, and the polynomial ring $\mathbb{F}_p[x]$
becomes a unique factorization domain. For small $p$ and small degree of $f$, the
irreducible factorization of $[f]_p$ in $\mathbb{F}_p[x]$ can be carried out by simply running through
all the irreducible polynomials in $\mathbb{F}_p[x]$. An analysis of such a factorization allows
us to say something about the factorization of $f$ in $\mathbb{Z}[x]$.


¹⁶ In particular, a principal ideal domain or field.
¹⁷ See Lemma 5.4 on p. 117.
¹⁸ We have used this already in the proof of Lemma 5.2 on p. 116.


Example 5.7 Let us show that the polynomial $f(x) = x^5 + x^2 + 1$ is irreducible in
$\mathbb{Z}[x]$. Since $f$ has no integer roots, a nontrivial factorization $f = gh$ in $\mathbb{Z}[x]$ is possible
only for $\deg(g) = 2$ and $\deg(h) = 3$. Reduction of the coefficients modulo 2 leads
to the polynomial $[f]_2 = x^5 + x^2 + 1$, with no roots in $\mathbb{F}_2$. This forces both $[g]_2$
and $[h]_2$ to be irreducible in $\mathbb{F}_2[x]$. Since $x^2 + x + 1$ is the only quadratic irreducible
polynomial in $\mathbb{F}_2[x]$ and it does not divide $x^5 + x^2 + 1$ in $\mathbb{F}_2[x]$, we conclude that $[f]_2$
is irreducible in $\mathbb{F}_2[x]$. Hence, $f$ is irreducible in $\mathbb{Z}[x]$ as well.
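The search through all polynomials of small degree over $\mathbb{F}_p$ used in this example can be automated. The following is a naive sketch for a prime $p$: coefficient lists start with the constant term, the function names are ours, and the trial division is exponential in the degree, so it is suitable for small cases only.

```python
from itertools import product

def trim(f):
    """Drop zero leading coefficients; polynomials are lists, lowest degree first."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def remainder(f, g, p):
    """Remainder of f on division by a monic polynomial g, coefficients taken modulo p."""
    f = trim([c % p for c in f])
    while len(f) >= len(g):
        lead, shift = f[-1], len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] = (f[shift + i] - lead * c) % p
        f = trim(f)
    return f

def is_irreducible_mod_p(f, p):
    """Trial division by every monic polynomial of degree 1, ..., deg(f)//2 over F_p."""
    f = trim([c % p for c in f])
    degree = len(f) - 1
    for d in range(1, degree // 2 + 1):
        for lower in product(range(p), repeat=d):
            g = list(lower) + [1]            # a monic polynomial of degree d
            if not remainder(f, g, p):       # empty remainder means g divides f
                return False
    return degree >= 1

f = [1, 0, 1, 0, 0, 1]                       # x^5 + x^2 + 1
print(is_irreducible_mod_p(f, 2), remainder(f, [1, 1, 1], 2))
```

The remainder of $x^5 + x^2 + 1$ modulo $x^2 + x + 1$ over $\mathbb{F}_2$ is the constant 1, in agreement with the example. The same function shows that $x^2 + 1$ is reducible over $\mathbb{F}_2$ and irreducible over $\mathbb{F}_3$.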


Example 5.8 (Eisenstein's Criterion) Assume that $f \in \mathbb{Z}[x]$ is monic and every
coefficient of $f$ except the leading coefficient is divisible by a prime $p \in \mathbb{N}$. Let
$f(x) = g(x)h(x)$ in $\mathbb{Z}[x]$. Since reduction modulo $p$ leads to $[f]_p(x) = x^n$, we
conclude that $[g]_p(x) = x^k$, $[h]_p(x) = x^m$ for some $k$, $m$. This means that all but
the leading coefficients of $g$, $h$ are divisible by $p$ as well. Hence, if both $g$ and $h$ are
of positive degree, then the constant term of $f$ must be divisible by $p^2$. We conclude
that a monic polynomial $f \in \mathbb{Z}[x]$ is irreducible if every coefficient of $f$ except the
leading coefficient is divisible by $p$ and the constant term is not divisible by $p^2$. This
observation is known as Eisenstein's criterion.
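The hypothesis of the criterion is straightforward to test. In the sketch below, the function name `eisenstein_primes` and the list of candidate primes are our own assumptions; coefficients are listed from the leading one down to the constant term.

```python
def eisenstein_primes(coeffs, candidates):
    """Primes p from `candidates` for which a monic polynomial satisfies the criterion.

    The criterion asks that p divide every non-leading coefficient and that
    p*p not divide the constant term.
    """
    if coeffs[0] != 1:
        raise ValueError("this sketch handles monic polynomials only")
    lower = coeffs[1:]
    return [p for p in candidates
            if all(c % p == 0 for c in lower) and lower[-1] % (p * p) != 0]

small_primes = [2, 3, 5, 7, 11, 13]
print(eisenstein_primes([1, 0, 0, -2], small_primes))        # x^3 - 2
print(eisenstein_primes([1, 5, 10, 10, 5], small_primes))    # t^4 + 5t^3 + 10t^2 + 10t + 5
print(eisenstein_primes([1, 0, 4], small_primes))            # x^2 + 4
```

The first two polynomials pass for $p = 2$ and $p = 5$ respectively. The empty list for $x^2 + 4$ shows that the criterion is only sufficient: this polynomial has no rational roots and is therefore irreducible, although no prime satisfies the hypothesis.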


Example 5.9 (Cyclotomic Polynomial $\Phi_p$) Eisenstein's criterion allows us to see
easily that for every prime $p \in \mathbb{N}$, the cyclotomic polynomial¹⁹

$$\Phi_p(x) = \frac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \cdots + x + 1$$

is irreducible in $\mathbb{Z}[x]$. Just pass to the new variable $t = x - 1$. Then

$$f(t) = \Phi_p(t + 1) = \frac{(t + 1)^p - 1}{t} = t^{p-1} + \binom{p}{1} t^{p-2} + \cdots + \binom{p}{p-1}$$

satisfies Eisenstein's criterion, because for $1 \le k \le p - 1$, the binomial coefficients
$\binom{p}{k}$ are divisible by $p$, and the last of them, equal to $p$, is not divisible by $p^2$.
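As a worked instance of this substitution, chosen by us for illustration, take $p = 5$:

```latex
\begin{align*}
\Phi_5(x)   &= x^4 + x^3 + x^2 + x + 1, \\
\Phi_5(t+1) &= \frac{(t+1)^5 - 1}{t}
             = \frac{t^5 + 5t^4 + 10t^3 + 10t^2 + 5t}{t} \\
            &= t^4 + 5t^3 + 10t^2 + 10t + 5 .
\end{align*}
```

All coefficients except the leading one are divisible by 5, and the constant term 5 is not divisible by 25, so Eisenstein's criterion applies with $p = 5$.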


¹⁹ See Sect. 3.5.4 on p. 60.


5.5.2 Kronecker's Algorithm


Kronecker's algorithm allows us to check whether $f \in \mathbb{Z}[x]$ is irreducible, and if
it is not, to produce an irreducible factorization of $f$ in $\mathbb{Z}[x]$ after a finite but quite
laborious computation. Let $\deg f = 2n$ or $\deg f = 2n + 1$ for some $n \in \mathbb{N}$. If there
exists a nontrivial factorization $f = gh$ in $\mathbb{Z}[x]$, we can assume that $\deg h \le n$.
Choose $n + 1$ different integers $z_0, z_1, \ldots, z_n \in \mathbb{Z}$, then list all collections of integers
$d = (d_0, d_1, \ldots, d_n)$ such that $d_i$ divides $f(z_i)$, and for each collection, form the
unique polynomial²⁰ $h_d \in \mathbb{Q}[x]$ such that $h_d(z_i) = d_i$ for all $i$:

$$h_d(x) = \sum_{i=0}^{n} d_i \cdot \prod_{\nu \ne i} \frac{x - z_\nu}{z_i - z_\nu}. \tag{5.15}$$

Clearly, the polynomial $h$, if it exists, is equal to one of the $h_d$ that has integer
coefficients. Therefore, to check whether $f$ has a divisor of degree $\le n$, it is enough
to examine whether $f$ is divisible by some $h_d \in \mathbb{Z}[x]$ from the finite list just
constructed.
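The procedure just described can be sketched in code. The version below is an unoptimized illustration under several assumptions of ours: $\deg f \ge 2$, coefficient lists start with the constant term, the points are $z_i = i$, and a point with $f(z_i) = 0$ is handled separately by splitting off the linear factor $x - z_i$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def poly_eval(f, x):
    """Evaluate f (coefficients listed lowest degree first) at x by Horner's rule."""
    value = 0
    for c in reversed(f):
        value = value * x + c
    return value

def poly_mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

def interpolate(points, values):
    """Lagrange interpolation (5.15): the polynomial of degree <= n with h(z_i) = d_i."""
    result = [Fraction(0)] * len(points)
    for i, (z_i, d_i) in enumerate(zip(points, values)):
        term = [Fraction(d_i)]
        for j, z_j in enumerate(points):
            if j != i:
                term = [c / (z_i - z_j) for c in poly_mul(term, [-z_j, 1])]
        result = [r + t for r, t in zip(result, term)]
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result

def exact_quotient(f, h):
    """Quotient f / h if the remainder is zero and the quotient is integral, else None."""
    rest, quotient = [Fraction(c) for c in f], []
    while len(rest) >= len(h):
        c = rest[-1] / h[-1]
        quotient.append(c)
        shift = len(rest) - len(h)
        for i, b in enumerate(h):
            rest[shift + i] -= c * b
        rest.pop()
    if any(rest) or any(c.denominator != 1 for c in quotient):
        return None
    return [int(c) for c in reversed(quotient)]

def divisors(n):
    positive = [d for d in range(1, abs(n) + 1) if n % d == 0]
    return positive + [-d for d in positive]

def kronecker_factor(f):
    """Return (h, g) with f = h * g and 1 <= deg h <= deg f // 2, or None if none exists."""
    n = (len(f) - 1) // 2
    points = list(range(n + 1))                    # z_0, ..., z_n: any distinct integers work
    values = [poly_eval(f, z) for z in points]
    for z, value in zip(points, values):
        if value == 0:                             # z is a root, so x - z divides f
            return [-z, 1], exact_quotient(f, [-z, 1])
    for d in product(*(divisors(v) for v in values)):
        h = interpolate(points, d)
        if len(h) < 2 or any(c.denominator != 1 for c in h):
            continue                               # constant or non-integer candidate
        g = exact_quotient(f, h)
        if g is not None:
            return [int(c) for c in h], g
    return None

print(kronecker_factor([1, 0, 1, 0, 1]))           # x^4 + x^2 + 1
print(kronecker_factor([1, 0, 1]))                 # x^2 + 1
```

For $x^4 + x^2 + 1$, the search over the divisors of $f(0) = 1$, $f(1) = 3$, $f(2) = 21$ produces the two quadratic factors $x^2 - x + 1$ and $x^2 + x + 1$, while for $x^2 + 1$ it returns `None`. A factor found in this way need not be irreducible itself; applying the function again to both factors yields a complete factorization. The number of candidate collections $d$ grows quickly, which is why the computation is laborious.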


Problems for Independent Solution to Chap. 5 


Problem 5.1 Find all positive integers $n$ divisible by 30 and having exactly 30
positive integer divisors including 1 and $n$ itself.


Problem 5.2 Find all rational roots of the polynomial $2x^4 - 7x^3 + 4x^2 - 2x - 3$.


Problem 5.3 Show that for every field $k$, the ring $k[[x]]$ is Euclidean with the degree
function $\nu$ taking a power series $f$ to the degree of the lowest term in $f$.


Problem 5.4 Assume that $\pm 1$ are the only common divisors of polynomials
$f, g \in \mathbb{Z}[x]$. Can the quotient ring $\mathbb{Z}[x]/(f, g)$ be infinite?

Problem 5.5 (Sums and Products of Ideals) Show that for ideals $I$, $J$ in a
commutative ring, the intersection $I \cap J$, product
$IJ \overset{\text{def}}{=} \{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n \mid x_i \in I,\ y_i \in J,\ n \in \mathbb{N}\}$,
and sum $I + J \overset{\text{def}}{=} \{x + y \mid x \in I,\ y \in J\}$ are ideals as well.
Prove that $IJ \subset I \cap J$ and give an example of $IJ \ne I \cap J$.

Problem 5.6 (Radical of an Ideal) Let $K$ be an arbitrary commutative ring with
unit. Given an ideal $I \subset K$, show that its radical
$\sqrt{I} \overset{\text{def}}{=} \{a \in K \mid \exists\, n \in \mathbb{N} : a^n \in I\}$
is also an ideal and check that $\sqrt{IJ} = \sqrt{I \cap J}$ for all ideals $I, J \subset K$.

Problem 5.7 (Coprime Ideals) Ideals $I$, $J$ of an arbitrary commutative ring $K$ with
unit are called coprime if $I + J = K$, i.e., there exist $x \in I$, $y \in J$ such that
$x + y = 1$. Show that: (a) $IJ = I \cap J$ for all coprime $I$, $J$, (b) if $I$ is coprime to
every ideal in the collection $J_1, J_2, \ldots, J_n$, then $I$ and $\bigcap_\nu J_\nu$ are coprime.

Problem 5.8 (Chinese Remainder Theorem) Prove that for every commutative
ring $K$ with unit and every collection of mutually coprime ideals
$\mathfrak{a}_1, \mathfrak{a}_2, \ldots, \mathfrak{a}_m \subset K$, we have
$\mathfrak{a}_1 \cdot \mathfrak{a}_2 \cdots \mathfrak{a}_m = \mathfrak{a}_1 \cap \mathfrak{a}_2 \cap \cdots \cap \mathfrak{a}_m$, and construct an isomorphism

$$K/(\mathfrak{a}_1 \cdot \mathfrak{a}_2 \cdots \mathfrak{a}_m) \simeq (K/\mathfrak{a}_1) \times (K/\mathfrak{a}_2) \times \cdots \times (K/\mathfrak{a}_m).$$


²⁰ See Exercise 3.10 on p. 50.


Problem 5.9 Given a homomorphism of commutative rings $\varphi : K \to L$ and a
polynomial $f = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \in K[x]$, write $f^\varphi$ for the polynomial
$\varphi(a_0) x^n + \varphi(a_1) x^{n-1} + \cdots + \varphi(a_n) \in L[x]$. Verify that the map

$$\varphi : K[x] \to L[x], \quad f \mapsto f^\varphi,$$

is a ring homomorphism. Assume that both rings $K$, $L$ are integral domains. Show
that if $f^\varphi$ is irreducible in $Q_L[x]$ and $\deg f^\varphi = \deg f$, then $f$ is irreducible in $K[x]$.

Problem 5.10 Determine whether the polynomial (a) $x^4 - 8x^3 + 12x^2 - 6x + 2$,
(b) $x^5 - 12x^3 + 36x - 12$, is reducible in $\mathbb{Q}[x]$.

Problem 5.11 Determine whether the following polynomials are irreducible in Z[x] 
and find the irreducible factorizations of those that are reducible: (a) x* + x + 1, 
(b) x + x4 +27 +x 4-2, (ec) x® +9 + 1, (d) x! — 9. 

Problem 5.12 For an arbitrary collection of distinct integers $a_1, \ldots, a_n \in \mathbb{Z}$,
determine whether the given polynomial is irreducible in $\mathbb{Q}[x]$:

(a) $(x - a_1)(x - a_2) \cdots (x - a_n) - 1$, (b) $(x - a_1)^2 \cdots (x - a_n)^2 + 1$.

Problem 5.13 Without using Zorn’s lemma, show that every proper ideal of a 
Noetherian ring is contained in some maximal ideal. 

Problem 5.14 List all ideals in the ring k[t], where k is a field, and describe all the 
maximal ideals among them. 

Problem 5.15 Find a nonprime irreducible element in the ring $\mathbb{Z}[\sqrt{13}]$.

Problem 5.16 Let $k$ be a field and $K \supset k$ an integral domain. Given $\xi \in K$, write
$\mathrm{ev}_\xi : k[x] \to K$ for the evaluation map, which sends
$f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \in k[x]$ to
$f(\xi) = a_0 \xi^n + a_1 \xi^{n-1} + \cdots + a_{n-1} \xi + a_n \in K$. Show that
(a) $\operatorname{im}(\mathrm{ev}_\xi)$ is the minimal subring in $K$ containing $k$ and $\xi$, (b) $\operatorname{im}(\mathrm{ev}_\xi)$ is a field
if and only if $\ker \mathrm{ev}_\xi \ne 0$.

Problem 5.17 Among the quotient rings of $\mathbb{Z}[i]$, is there a field of characteristic
(a) 2? (b) 3? If such a field exists, what can its cardinality be equal to?

Problem 5.18 Let $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_m$ be prime ideals and $I$ an arbitrary ideal such that
$I \subset \bigcup_k \mathfrak{p}_k$. Prove that $I \subset \mathfrak{p}_k$ for some $k$.

Problem 5.19 Let $K$ be a commutative ring with unit, $S \subset K$ a multiplicative set,
$I \subset K$ an ideal such that $I \cap S = \varnothing$. Use Zorn's lemma to show that there exists
an ideal $\mathfrak{p} \subset K$ maximal among ideals $J \subset K$ with $J \supset I$ and $J \cap S = \varnothing$, and
every such $\mathfrak{p}$ is prime.

Problem 5.20* (Nilradical). In a commutative ring $K$ with unit, the radical of the
zero ideal²¹ is called the nilradical of $K$ and is denoted by

$$\mathfrak{n}(K) \overset{\text{def}}{=} \sqrt{(0)} = \{a \in K \mid a^n = 0 \text{ for some } n \in \mathbb{N}\}.$$

Show that $\mathfrak{n}(K)$ coincides with the intersection of all proper prime ideals $\mathfrak{p} \subset K$.


²¹ Compare with Problem 5.6.
Problem 5.21* (Krull's Criterion). Show that an integral domain $K$ is a unique
factorization domain if and only if for every nonzero element $a \in K$, every
minimal prime ideal²² $\mathfrak{p} \ni a$ is principal.


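Problem 5.22 Let an integral domain $K$ be equipped with a function

$$\nu : K \smallsetminus \{0\} \to \mathbb{Z}_{\ge 0}$$

such that $\forall\, a, b \in K \smallsetminus 0 \ \ \exists\, q, r \in K : a = bq + r$ and either $\nu(r) < \nu(b)$ or $r = 0$.
Show that the function $\mu : K \smallsetminus \{0\} \to \mathbb{Z}_{\ge 0}$ defined by
$\mu(a) \overset{\text{def}}{=} \min_{b \in K \smallsetminus 0} \nu(ab)$
is a Euclidean degree²³ on $K$.


²² That is, a prime ideal that does not contain prime ideals $\mathfrak{q} \subset \mathfrak{p}$ except for $\mathfrak{q} = \mathfrak{p}$ and $\mathfrak{q} = (0)$.
²³ See Definition 5.4 on p. 109.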


Chapter 6 
Vectors 


In this section we continue to use the notation of Chaps. 3-5 and write K for an 
arbitrary commutative ring with unit and k for an arbitrary field. 


6.1 Vector Spaces and Modules 


6.1.1 Definitions and Examples 


We begin with the definition of a vector space. It formalizes the algebraic properties 
of geometric vectors, namely the addition of vectors and the multiplication of 
vectors by constants. Although there is quite a variety of vector spaces—field 
extensions, function spaces, spaces of solutions of systems of linear equations, even 
spaces of subsets—it is useful to think of vectors as arrows considered up to a 
translation. 


Definition 6.1 (Vector Space over k) An additive abelian group $V$ is called a
vector space over a field $k$ if it is equipped with an operation

$$k \times V \to V, \quad (\lambda, v) \mapsto \lambda v,$$

called multiplication of vectors by scalars and possessing the following properties:

$$\forall\, \lambda, \mu \in k \ \ \forall\, v \in V \quad \lambda(\mu v) = (\lambda\mu) v, \tag{6.1}$$
$$\forall\, \lambda, \mu \in k \ \ \forall\, v \in V \quad (\lambda + \mu) v = \lambda v + \mu v, \tag{6.2}$$
$$\forall\, v, w \in V \ \ \forall\, \lambda \in k \quad \lambda(v + w) = \lambda v + \lambda w, \tag{6.3}$$
$$\forall\, v \in V \quad 1 \cdot v = v. \tag{6.4}$$

The elements of the field $k$ are called scalars, and $k$ itself is referred to as the ground
field. The elements of $V$ are called vectors. The additive group operation $V \times V \to V$
is called addition of vectors. The neutral element $0 \in V$ is called the zero vector.
A subset $U \subset V$ is called a subspace of $V$ if it is itself a vector space with respect
to the operations on $V$.


Definition 6.2 (K-Module) If the field $k$ in Definition 6.1 is replaced by an
arbitrary commutative ring $K$ and the property (6.4) is excluded, then an additive
abelian group $V$ equipped with multiplication by scalars $K \times V \to V$ satisfying the
remaining conditions (6.1)-(6.3) is called a module over $K$ or a $K$-module for
short. If the ring $K$ has a unit element and the property (6.4) is satisfied as well, then
the $K$-module $V$ is called unital.


Therefore, vector spaces are particular examples of unital modules.


Exercise 6.1 Deduce from (6.1)-(6.3) that $0 \cdot v = 0$ and $\lambda \cdot 0 = 0$ for all $v \in V$ and
all $\lambda \in K$ in a $K$-module $V$. Show that for every unital module over a commutative
ring with unit, $(-1) \cdot v = -v$ for all $v \in V$.


Agreement 6.1 Sometimes, it is more convenient to write $v\lambda$ instead of $\lambda v$ for the
product of a vector $v \in V$ and scalar $\lambda \in K$. By definition, we put $v\lambda \overset{\text{def}}{=} \lambda v$ in a
module $V$ over a commutative ring $K$.


Example 6.1 (Zero Module) The simplest module is the zero module 0, which
consists of the zero vector only. The zero vector is opposite to itself, and $\lambda \cdot 0 = 0$
for all $\lambda \in K$.


Example 6.2 (Free Module of Rank 1 and Its Submodules) Every ring of scalars $K$
is a $K$-module with respect to addition and multiplication in $K$. Such a module is
called a free module of rank 1. The submodules of $K$ are precisely the ideals $I \subset K$.


Example 6.3 (Abelian Groups) Every additive abelian group $A$ has the natural
structure of a $\mathbb{Z}$-module, in which multiplication by scalars is defined by

$$ma \overset{\text{def}}{=} \operatorname{sgn}(m) \cdot (\underbrace{a + a + \cdots + a}_{|m| \text{ times}}).$$

The submodules of $A$ coincide with the additive subgroups of $A$.


6.1.2 Linear Maps 


Every map of $K$-modules $F : U \to W$ that respects addition and multiplication by
scalars, that is, a map that satisfies for all $a, b \in U$ and all $\alpha, \beta \in K$ the condition

$$F(\alpha a + \beta b) = \alpha F(a) + \beta F(b),$$

is called a homomorphism of $K$-modules or, more frequently, a linear map.¹ We
met linear maps $\mathbb{Q}[x] \to \mathbb{Q}[x]$ earlier in Sect. 4.4 on p. 88, when we studied the
operators $\Phi(D)$ on the space of polynomials.

Two $K$-modules $U$, $W$ are said to be isomorphic if there is a linear bijection
$\varphi : U \xrightarrow{\sim} W$. Such a bijection is called an isomorphism of $K$-modules.

Since every linear map $F : U \to W$ is a homomorphism of abelian groups, it
possesses all the properties of such homomorphisms.² In particular, $F(0) = 0$ and
$F(-v) = -F(v)$ for all $v$. The image $\operatorname{im} F = F(U) \subset W$ is a submodule of $W$,
because $\lambda F(u) = F(\lambda u)$ for all $u \in U$. The kernel

$$\ker F = F^{-1}(0) = \{u \in U \mid F(u) = 0\}$$

is a submodule in $U$, because $F(u) = 0$ forces $F(\lambda u) = \lambda F(u) = 0$. By
Proposition 2.1 on p. 32, every nonempty fiber of $F$ is a parallel shift of the kernel by
any element of the fiber: $F^{-1}(F(u)) = u + \ker F = \{u + v \mid F(v) = 0\}$. Therefore,
a linear map $F$ is injective if and only if $\ker F = 0$.


Caution 6.1 Given $a, b \in K$, the map $\varphi : K \to K$, $\varphi(x) = a \cdot x + b$, which often
is called a "linear function" in calculus, is linear in the sense of the above definition
only for $b = 0$. If $b \ne 0$, then $\varphi(\lambda x) \ne \lambda \varphi(x)$ and $\varphi(x + y) \ne \varphi(x) + \varphi(y)$ for
most $x$, $y$, $\lambda$. Thus in algebra, "linear" means what is called "linear homogeneous"
in calculus.
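The distinction can be observed on a finite sample of rational arguments. The sketch below takes $K = \mathbb{Q}$ and $a = 3$; the function `is_linear` is our own helper, and a positive answer on a finite sample is of course only evidence, not a proof of linearity.

```python
from fractions import Fraction
from itertools import product

def is_linear(phi, samples):
    """Check additivity and homogeneity of phi on all pairs drawn from samples."""
    additive = all(phi(x + y) == phi(x) + phi(y) for x, y in product(samples, repeat=2))
    homogeneous = all(phi(l * x) == l * phi(x) for l, x in product(samples, repeat=2))
    return additive and homogeneous

samples = [Fraction(n, 2) for n in range(-4, 5)]

def homogeneous_map(x):
    return 3 * x            # a = 3, b = 0

def affine_map(x):
    return 3 * x + 1        # a = 3, b = 1

print(is_linear(homogeneous_map, samples), is_linear(affine_map, samples))
```

The map with $b = 1$ already fails additivity at $x = y = 0$, since $\varphi(0) = 1 \ne \varphi(0) + \varphi(0)$.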


6.1.3 Proportional Vectors 


Let $V$ be a vector space over a field $k$. Vectors $a, b \in V$ are called proportional³ if
$xa = yb$ for some $x, y \in k$ that are not both equal to zero. Thus, the zero vector is proportional to
every vector, whereas the proportionality of nonzero vectors $a$, $b$ means that there
exists a nonzero constant $\lambda \in k$ such that $a = \lambda b$, or equivalently, $b = \lambda^{-1} a$.


Example 6.4 (Coordinate Plane) The simplest example of a nonzero vector space
different from $k$ is the coordinate plane $k^2 = k \times k$. By definition, it consists of
ordered pairs of numbers arranged in columns of height two:

$$v = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad x_1, x_2 \in k.$$


¹ Or $K$-linear, if the precise reference to the ring of scalars is important.
² See Sect. 2.6 on p. 31.
³ Also collinear or linearly related.


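Addition and multiplication by scalars are defined componentwise:

$$\lambda \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \mu \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \overset{\text{def}}{=} \begin{pmatrix} \lambda a_1 + \mu b_1 \\ \lambda a_2 + \mu b_2 \end{pmatrix}.$$

Vectors $a = \binom{a_1}{a_2}$ and $b = \binom{b_1}{b_2}$ are proportional if and only if $a_1 b_2 = a_2 b_1$. The
difference

$$\det(a, b) \overset{\text{def}}{=} a_1 b_2 - a_2 b_1$$

is called the determinant of the vectors $a, b \in k^2$. It is clear that

$$\det(a, b) = 0 \iff a \text{ and } b \text{ are proportional}, \tag{6.5}$$
$$\det(a, b) = -\det(b, a) \quad \forall\, a, b \in k^2, \tag{6.6}$$
$$\det(\lambda a, b) = \lambda \det(a, b) = \det(a, \lambda b) \quad \forall\, a, b \in k^2,\ \lambda \in k, \tag{6.7}$$
$$\det(a_1 + a_2, b) = \det(a_1, b) + \det(a_2, b), \tag{6.8}$$
$$\det(a, b_1 + b_2) = \det(a, b_1) + \det(a, b_2) \quad \forall\, a, a_1, a_2, b, b_1, b_2 \in k^2. \tag{6.9}$$

Properties (6.6), (6.7), and (6.8)-(6.9) are referred to as skew symmetry, homogeneity,
and additivity respectively. Homogeneity and additivity together mean that the
determinant is linear in each argument when the other is fixed, i.e., for all $a, b \in V = k^2$,
the functions

$$\det(\ast, b) : V \to k, \quad v \mapsto \det(v, b),$$
$$\det(a, \ast) : V \to k, \quad v \mapsto \det(a, v) \tag{6.10}$$

are both linear. Such a combination of properties is called bilinearity. An equivalent
reformulation of bilinearity is the following distributivity law:

$$\det(\alpha a + \beta b,\ \gamma c + \delta d) = \alpha\gamma \det(a, c) + \alpha\delta \det(a, d) + \beta\gamma \det(b, c) + \beta\delta \det(b, d) \tag{6.11}$$

for all $a, b, c, d \in k^2$ and $\alpha, \beta, \gamma, \delta \in k$.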


Example 6.5 (Cramer's Rule in the Coordinate Plane) Any two nonproportional
vectors $a, b \in k^2$ form a basis of the coordinate plane in the sense that every vector
$v \in k^2$ admits a unique representation

$$v = x \cdot a + y \cdot b, \quad \text{where } x, y \in k. \tag{6.12}$$

Indeed, given such an expression, then in virtue of the relations $\det(a, a) =
\det(b, b) = 0$, the evaluation of the linear functions (6.10) on both sides of (6.12)
leads to

$$\det(v, b) = \det(x \cdot a + y \cdot b,\ b) = x \cdot \det(a, b) + y \cdot \det(b, b) = x \cdot \det(a, b),$$
$$\det(a, v) = \det(a,\ x \cdot a + y \cdot b) = x \cdot \det(a, a) + y \cdot \det(a, b) = y \cdot \det(a, b).$$

Hence, the coefficients $x$, $y$ in (6.12) are uniquely determined by $a$, $b$, $v$ as

$$x = \det(v, b)/\det(a, b), \qquad y = \det(a, v)/\det(a, b). \tag{6.13}$$

These formulas are known as Cramer's rule. To verify that for every $v \in k^2$, the
identity

$$v = \frac{\det(v, b)}{\det(a, b)} \cdot a + \frac{\det(a, v)}{\det(a, b)} \cdot b$$

actually holds, note that the difference $v - \det(v, b) \cdot a/\det(a, b)$ is proportional to
$b$, because

$$\det\!\left(v - \frac{\det(v, b)}{\det(a, b)} \cdot a,\ b\right) = \det(v, b) - \frac{\det(v, b)}{\det(a, b)} \cdot \det(a, b) = 0.$$

Therefore, $v = \det(v, b) \cdot a/\det(a, b) + \lambda \cdot b$ for some $\lambda \in k$. As we have just seen,
this forces $\lambda = \det(a, v)/\det(a, b)$.
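Formulas (6.13) can be tried out over $k = \mathbb{Q}$. The following sketch assumes vectors with integer coordinates stored as pairs; the vectors $a$, $b$, $v$ are sample data of ours.

```python
from fractions import Fraction

def det(a, b):
    """Determinant a_1*b_2 - a_2*b_1 of two vectors of the coordinate plane."""
    return a[0] * b[1] - a[1] * b[0]

def cramer(a, b, v):
    """Coefficients (x, y) with v = x*a + y*b, computed by Cramer's rule (6.13)."""
    d = det(a, b)
    if d == 0:
        raise ValueError("a and b are proportional, so they do not form a basis")
    return Fraction(det(v, b), d), Fraction(det(a, v), d)

a, b, v = (2, 1), (1, 3), (4, 7)
x, y = cramer(a, b, v)
print(x, y)
```

Here $\det(a, b) = 5$, $\det(v, b) = 5$, and $\det(a, v) = 10$, so $x = 1$, $y = 2$, and indeed $1 \cdot (2, 1) + 2 \cdot (1, 3) = (4, 7)$.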


6.2 Bases and Dimension 


6.2.1 Linear Combinations 


Let $V$ be an arbitrary module over a commutative ring $K$ with unit. A finite
expression of the form $\lambda_1 w_1 + \lambda_2 w_2 + \cdots + \lambda_m w_m$, where $w_i \in V$, $\lambda_i \in K$, is called
a linear combination of vectors $w_1, w_2, \ldots, w_m$ with coefficients $\lambda_1, \lambda_2, \ldots, \lambda_m$. A
linear combination is nontrivial if not all the $\lambda_i$ are equal to zero. We say that a set
of vectors $\Gamma \subset V$ spans⁴ $V$ if every vector in $V$ is a linear combination of some
vectors $w_1, w_2, \ldots, w_m \in \Gamma$. A $K$-module is finitely generated if it is spanned by a
finite set of vectors. A set $E \subset V$ is called a basis of $V$ if $E$ generates $V$ and each
vector $v \in V$ has a unique expansion as a linear combination of vectors from $E$.
Here uniqueness means that an equality between two linear combinations of basic
vectors

$$x_1 e_1 + x_2 e_2 + \cdots + x_n e_n = y_1 e_1 + y_2 e_2 + \cdots + y_n e_n, \quad \text{where } e_i \in E,\ x_i, y_i \in K,$$

forces $x_i = y_i$ for all $i$. If a $K$-module $V$ admits a basis $E \subset V$, then $V$ is called a
free $K$-module. By definition, every vector $v$ in a free module $V$ with basis $E$ can
be uniquely written as a linear combination $v = \sum_{e \in E} x_e e$ in which all but finitely
many coefficients $x_e(v)$ vanish. The coefficients $x_e = x_e(v)$ of this expression are
called the coordinates of $v$ in the basis $E$.


⁴ Or linearly generates.


Caution 6.2 When $K$ is not a field, a $K$-module will not be free in general. For
example, the additive abelian group $\mathbb{Z}/(n)$, considered as a $\mathbb{Z}$-module,⁵ is generated
by the class $[1]_n$, because every class $[m]_n = m \cdot [1]_n$ is proportional to $[1]_n$ with
coefficient $m \in \mathbb{Z}$. However, $\mathbb{Z}/(n)$ does not admit a basis over $\mathbb{Z}$, since for every
vector $[m]_n$, two different multiples $\lambda \cdot [m]_n = [\lambda m]_n$ and $\mu \cdot [m]_n = [\mu m]_n$ are equal
in $\mathbb{Z}/(n)$ as soon as $\lambda = \mu + kn$ in $\mathbb{Z}$.


Example 6.6 (Coordinate Module $K^n$) A coordinate module $K^n$ over an arbitrary
commutative ring $K$ with unit is defined in the same way as the coordinate plane
from Example 6.4. By definition, a vector of $K^n$ is an ordered collection of $n$
elements⁶ of $K$ arranged in either columns or rows⁷ of size $n$:

$$(x_1, x_2, \ldots, x_n), \quad x_i \in K.$$

Addition of vectors and multiplication of vectors by scalars are defined
componentwise:

$$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) \overset{\text{def}}{=} (x_1 + y_1,\ x_2 + y_2,\ \ldots,\ x_n + y_n),$$
$$\lambda \cdot (x_1, x_2, \ldots, x_n) \overset{\text{def}}{=} (\lambda x_1, \lambda x_2, \ldots, \lambda x_n).$$

The standard basis of $K^n$ consists of the vectors

$$e_i = (0, \ldots, 0, 1, 0, \ldots, 0), \quad 1 \le i \le n, \tag{6.14}$$

where all but the $i$th coordinates vanish. Clearly,

$$(x_1, x_2, \ldots, x_n) = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$$

is the unique expression of the vector $v = (x_1, x_2, \ldots, x_n) \in K^n$ as a linear
combination of $e_1, e_2, \ldots, e_n$. Thus, $K^n$ is free and admits a basis of cardinality $n$.


⁵ See Example 6.3 on p. 124.

⁶ More scientifically, we could say that $K^n$ is a direct product of $n$ copies of the abelian group $K$
equipped with componentwise multiplication by elements $\lambda \in K$.

⁷ To save space, we shall usually write them in rows. However, when the column notation becomes
more convenient, we shall use it as well.


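Example 6.7 (Matrices) An $m \times n$ matrix over $K$ is a rectangular array with $m$ rows
and $n$ columns,

$$A = (a_{ij}) = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix},$$

filled by scalars $a_{ij} \in K$. We write $\mathrm{Mat}_{m \times n}(K)$ for the $K$-module of all $m \times n$ matrices
over $K$. Addition of matrices and multiplication of matrices by scalars are defined
componentwise as follows: the $(i, j)$ entry of the linear combination $\lambda A + \mu B$ equals
$\lambda a_{ij} + \mu b_{ij}$, where $a_{ij}$, $b_{ij}$ are the $(i, j)$ entries of the matrices $A = (a_{ij})$, $B = (b_{ij})$.
For example, in $\mathrm{Mat}_{2 \times 3}(\mathbb{Z})$, we have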


2. (5 0 3G i = (eS) 

2-1 3 3.05 5-2-9) © 

In particular, the coordinate module $K^n$ can be thought of either as a module of
one-row matrices $\mathrm{Mat}_{1 \times n}(K)$ or as a module of one-column matrices $\mathrm{Mat}_{n \times 1}(K)$.
We write $E_{ij}$ for a matrix that has 1 in the $(i, j)$ cell and 0 in all the other cells. An
arbitrary matrix $A = (a_{ij})$ is uniquely expanded in terms of the $E_{ij}$ as $A = \sum_{ij} a_{ij} E_{ij}$.
Thus, the $mn$ matrices $E_{ij}$ form a basis of the module $\mathrm{Mat}_{m \times n}(K)$. They are called
the standard basis matrices. Thus, the module $\mathrm{Mat}_{m \times n}(K)$ is free and admits a basis
of cardinality $mn$.


Example 6.8 (Module of Functions) For a set $X$, the ring $K^X$ of all functions
$f : X \to K$ can be considered a $K$-module with respect to the standard pointwise
addition and multiplication by constants

$$f_1 + f_2 : x \mapsto f_1(x) + f_2(x) \quad \text{and} \quad \lambda f : x \mapsto \lambda f(x).$$

For the finite set $X = \{1, 2, \ldots, n\}$, there exists an isomorphism of $K$-modules
$K^X \xrightarrow{\sim} K^n$ sending a function $f : X \to K$ to the collection of its values
$(f(1), f(2), \ldots, f(n)) \in K^n$. The inverse isomorphism sends the standard basic
vector $e_i \in K^n$ to the $\delta$-function $\delta_i : X \to K$ defined by

$$\delta_i : k \mapsto \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{if } k \ne i. \end{cases} \tag{6.15}$$


Example 6.9 (Space of Subsets) Let $K = \mathbb{F}_2 = \{0, 1\}$ be the field of two elements.
For a set $X$, write $\mathbb{F}_2^X$ for the vector space of all functions $\chi : X \to \mathbb{F}_2$ considered
in Example 6.8 above, and write $S(X)$ for the set of all subsets of $X$. There is a
natural bijection $S(X) \xrightarrow{\sim} \mathbb{F}_2^X$ taking a subset $Z \subset X$ to its characteristic function
$\chi_Z : X \to \mathbb{F}_2$ defined by

$$\chi_Z(x) = \begin{cases} 1 & \text{for } x \in Z, \\ 0 & \text{for } x \notin Z. \end{cases}$$

The inverse map takes a function $\chi : X \to \mathbb{F}_2$ to its support
$\operatorname{Supp}(\chi) = \{x \in X \mid \chi(x) \ne 0\}$. The vector space structure on $\mathbb{F}_2^X$ can be transferred to $S(X)$ by means of
this bijection. Then the addition of functions becomes the symmetric difference of
subsets $Z_1 \,\triangle\, Z_2 \overset{\text{def}}{=} (Z_1 \cup Z_2) \smallsetminus (Z_1 \cap Z_2)$. Multiplication of functions by 0 and 1 becomes
$0 \cdot Z \overset{\text{def}}{=} \varnothing$ and $1 \cdot Z \overset{\text{def}}{=} Z$ respectively.
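The transfer of structure in this example can be made concrete for a small finite set. In the sketch below, functions $X \to \mathbb{F}_2$ are stored as tuples of 0s and 1s indexed by a fixed ordering of $X$; the set `universe` and the subsets `Z1`, `Z2` are sample data of ours.

```python
def characteristic(subset, universe):
    """Characteristic function of a subset, stored as a tuple of values in F_2 = {0, 1}."""
    return tuple(1 if x in subset else 0 for x in universe)

def support(chi, universe):
    """Inverse map: the set of points where the function is nonzero."""
    return {x for x, value in zip(universe, chi) if value != 0}

def add_functions(chi_1, chi_2):
    """Pointwise addition in F_2, that is, addition modulo 2."""
    return tuple((u + v) % 2 for u, v in zip(chi_1, chi_2))

universe = [1, 2, 3, 4, 5]
Z1, Z2 = {1, 2, 3}, {3, 4}
total = add_functions(characteristic(Z1, universe), characteristic(Z2, universe))
print(support(total, universe), Z1 ^ Z2)
```

The support of the sum is $\{1, 2, 4\}$, which coincides with the symmetric difference computed by Python's built-in set operator `^`. Adding a characteristic function to itself gives the zero function, so every subset is its own opposite.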


Exercise 6.2 By a direct verification of the axioms, check that the operations just 
defined provide S(X) with the structure of a vector space over F >. For the finite set 
X= {1,2,... ,n}, give a basis of S(X) over Fo. 


Example 6.10 (Polynomials and Power Series) The polynomial ring $K[x]$ can be
considered a $K$-module with respect to the usual addition of polynomials and
multiplication of polynomials by constants. By the definition of $K[x]$, the countable
set of the monomials $1, x, x^2, \ldots$ is a basis of $K[x]$ over $K$. The polynomials of degree
at most $n$ form a submodule $K[x]_{\le n} \subset K[x]$, and the monomials $1, x, x^2, \ldots, x^n$
form a basis of $K[x]_{\le n}$.


Exercise 6.3 Show that every collection of monic polynomials $f_0, f_1, \ldots, f_n$ such
that $\deg f_k = k$ for each $k$ is a basis in $K[x]_{\le n}$.


The formal power series with coefficients in $K$ also clearly form a $K$-module.
However, in contrast with $K[x]$, the monomials $x^m$, $m \ge 0$, do not form a basis of
$K[[x]]$, because a series with infinitely many nonzero coefficients cannot be written
as a finite linear combination of monomials. To find a basis in $K[[x]]$, we have to use
the axiom of choice and transfinite machinery.⁸


Exercise 6.4 Show that the vector space $\mathbb{Q}[[x]]$ over $\mathbb{Q}$ does not admit a countable
basis.


6.2.2 Linear Dependence 


A set of vectors $S$ in a $K$-module $V$ is said to be linearly independent if every
nontrivial linear combination of the vectors in $S$ does not vanish, i.e., if every
relation


$$\lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_m v_m = 0, \quad\text{where } v_i \in S,\ \lambda_i \in K, \tag{6.16}$$


forces all the $\lambda_i$ to be zero. Conversely, a set $S$ is linearly dependent or linearly
related if some nontrivial linear combination of vectors in $S$ equals zero, i.e., the
equality (6.16) holds for some $v_i \in S$ and some $\lambda_i \in K$ not all equal to zero. Such an
equality (as well as the nontrivial linear combination of vectors $v_i$ on the left-hand
side) is called a linear relation among $v_1, v_2, \ldots, v_m$. Note that every set of vectors
containing the zero vector is linearly related.⁹


⁸ See Sect. 6.2.4 on p. 134.

A linear expression $v_m = \mu_1 v_1 + \mu_2 v_2 + \cdots + \mu_{m-1} v_{m-1}$ can be rewritten as the
linear relation


$$\mu_1 v_1 + \mu_2 v_2 + \cdots + \mu_{m-1} v_{m-1} - v_m = 0.$$


Thus, if some vector in $S$ can be expressed as a linear combination of some other
vectors in $S$, then $S$ is linearly related. For a vector space $V$ over a field $k$, the
converse is also true: if vectors $v_1, v_2, \ldots, v_m \in V$ are linearly related, then one of
them is a linear combination of the others. Indeed, the linear relation (6.16) allows
us to express any $v_i$ such that $\lambda_i \ne 0$ in terms of the others. For example, if $\lambda_m \ne 0$
in (6.16), then


$$v_m = -\frac{\lambda_1}{\lambda_m} v_1 - \frac{\lambda_2}{\lambda_m} v_2 - \cdots - \frac{\lambda_{m-1}}{\lambda_m} v_{m-1}.$$


In particular, two vectors in a vector space are linearly related if and only if they are
proportional.¹⁰

For modules over an arbitrary ring $K$, the existence of a linear relation (6.16) does
not permit us, in general, to express one of the vectors $v_i$ as a linear combination of
the others. For example, let $V = K = \mathbb{Q}[x, y]$. The polynomials $x$ and $y$, considered
as vectors in $V$, are linearly related over $K$, because $\lambda x - \mu y = 0$ for $\lambda = y$, $\mu = x$.
However, there is no $f \in K$ such that $x = fy$ or $y = fx$.


Lemma 6.1 Assume that a $K$-module $V$ is generated by a set $E \subset V$. Then $E$ is a
basis of $V$ if and only if $E$ is linearly independent.


Proof If $\sum \lambda_i e_i = 0$ for some $e_i \in E$ and some $\lambda_i \in K$ not all equal to zero, then the
zero vector has two different linear expansions $0 = \sum \lambda_i e_i = \sum 0 \cdot e_i$. Conversely,
if $\sum x_i e_i = \sum y_i e_i$, where $x_i \ne y_i$ for some $i$, then $\sum (x_i - y_i) e_i = 0$ is a nontrivial
linear relation among the $e_i$. □


⁹ Since if $v_i = 0$, for instance, then we can take $\lambda_i = 1$ and all the other $\lambda_j = 0$.
¹⁰ See Sect. 6.1.3 on p. 125.


6.2.3 Basis of a Vector Space


At this point, we suspend the discussion of general $K$-modules until Chap. 14 and
concentrate on the case of vector spaces $V$ over an arbitrary field $k$.


Lemma 6.2 (Exchange Lemma) Assume that the vectors $w_1, w_2, \ldots, w_m$ generate
the vector space $V$ and that the vectors $u_1, u_2, \ldots, u_k \in V$ are linearly independent.
Then $m \ge k$, and after appropriate renumbering of the vectors $w_i$ and replacing the
first $k$ of them by $u_1, u_2, \ldots, u_k$, the resulting collection


$$u_1, u_2, \ldots, u_k, w_{k+1}, w_{k+2}, \ldots, w_m$$


will generate $V$ as well.


Proof Let $u_1 = x_1 w_1 + x_2 w_2 + \cdots + x_m w_m$. Since the vectors $u_i$ are linearly
independent, we have $u_1 \ne 0$, and therefore, not all the $x_i$ equal zero. Renumber
the $w_i$ in order to have $x_1 \ne 0$. Then


$$w_1 = \frac{1}{x_1} u_1 - \frac{x_2}{x_1} w_2 - \cdots - \frac{x_m}{x_1} w_m.$$


Therefore, $u_1, w_2, w_3, \ldots, w_m$ span $V$. Now proceed by induction. Assume that for
some $i$ in the range $1 \le i < k$, the vectors $u_1, \ldots, u_i, w_{i+1}, \ldots, w_m$ span $V$. Then


$$u_{i+1} = y_1 u_1 + y_2 u_2 + \cdots + y_i u_i + x_{i+1} w_{i+1} + x_{i+2} w_{i+2} + \cdots + x_m w_m. \tag{6.17}$$


Since the vectors $u_i$ are linearly independent, the vector $u_{i+1}$ cannot be expressed
as a linear combination of just the vectors $u_1, u_2, \ldots, u_i$, i.e., there is some $x_{i+j} \ne 0$
in (6.17). Therefore, $m > i$, and we can renumber the vectors $w_j$ in (6.17) in order
to have $x_{i+1} \ne 0$. Then the vector $w_{i+1}$ can be expressed as a linear combination of
the vectors $u_1, u_2, \ldots, u_{i+1}, w_{i+2}, w_{i+3}, \ldots, w_m$. Hence, this set of vectors spans $V$.
This completes the inductive step of the proof. □


Exercise 6.5 Show that a vector space V is finitely generated if and only if the 
cardinalities of all the finite linearly independent subsets of V are bounded above. 


Theorem 6.1 In every vector space $V$ over a field $k$, every generating set of vectors
contains a basis of $V$, and every linearly independent set of vectors can be extended
to a basis of $V$. Moreover, all bases of $V$ have the same cardinality.


Proof We prove the theorem under the additional assumption that V is finitely 
generated. In the general case, the proof is similar but uses some transfinite 
arguments. We sketch those arguments in Sect. 6.2.4 below. 

By Lemma 6.1, the bases of $V$ are exactly the linearly independent generating
sets of vectors. By Lemma 6.2, the cardinality of a linearly independent set is less
than or equal to the cardinality of any generating set. Therefore, all the bases have
the same cardinality.

Now let the vectors $v_1, v_2, \ldots, v_m$ span $V$. If they are linearly related, then one of
them is a linear combination of the others. If we remove this vector, the remaining
set will generate $V$ as well. After a finite number of such removals, we will end up
with a linearly independent generating set, that is, a basis of $V$.

Finally, let the vectors $e_1, e_2, \ldots, e_k$ be linearly independent. If they do not span
$V$, then there exists some $v \in V$ that cannot be written as a linear combination
of $e_1, e_2, \ldots, e_k$. If we add such a $v$ to $e_1, e_2, \ldots, e_k$, the resulting collection of
$k + 1$ vectors will also be linearly independent. Since the cardinalities of linearly
independent sets of vectors in a finitely generated vector space are bounded above
by Exercise 6.5, after a finite number of such extensions we will have a linearly
independent generating set. □
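

The first half of the argument can be tried out on a small example. The sketch below is an illustration under stated assumptions, not the procedure of the proof itself: instead of removing dependent vectors one at a time, it scans the generators once and keeps each vector that is not in the span of those already kept, which yields the same kind of result. It works over $\mathbb{F}_2$, where spans are finite and can be listed directly; the function names `span` and `extract_basis` are hypothetical.

```python
def span(vectors, n):
    """Set of all F_2-linear combinations of the given vectors in F_2^n."""
    result = {tuple([0] * n)}
    for v in vectors:
        # Adjoin v: every new element is an old element plus v (mod 2).
        result |= {tuple((a + b) % 2 for a, b in zip(w, v)) for w in result}
    return result

def extract_basis(generators, n):
    """Greedy selection of a basis of span(generators) from the generators."""
    basis = []
    for v in generators:
        if v not in span(basis, n):  # v is independent of the kept vectors
            basis.append(v)
    return basis

generators = [(1, 1, 0), (0, 1, 1), (1, 0, 1), (0, 0, 0), (1, 1, 1)]
basis = extract_basis(generators, 3)
```

Here the third generator is the sum of the first two and the fourth is zero, so both are skipped, and the three remaining vectors span the same subspace as the original five.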


Definition 6.3 Let $V$ be a vector space over any field $k$. The cardinality of any
basis in $V$ is called the dimension of $V$ and is denoted $\dim V$, or $\dim_k V$ if a precise
reference to the ground field is required. A finitely generated vector space is called
finite-dimensional.


Corollary 6.1 In an n-dimensional vector space V, the following properties of a 
collection of n vectors are equivalent: 


(1) They are linearly independent. 
(2) They span V. 
(3) They form a basis of V. 


Proof Let $e_1, e_2, \ldots, e_n$ be a basis in $V$ and $v_1, v_2, \ldots, v_n$ the given vectors. If
they are linearly independent, then by Lemma 6.2, replacement of all vectors $e_i$
by vectors $v_i$ leads to a generating collection of vectors. Hence, (1) ⇒ (2). If the
vectors $v_i$ generate $V$, then by Theorem 6.1, some of them form a basis of $V$. Since
the cardinality of this basis equals $n$, we conclude that (2) ⇒ (3). The implication
(3) ⇒ (1) was a part of Lemma 6.1. □


Corollary 6.2 Every $n$-dimensional vector space $V$ over a field $k$ is isomorphic
to the coordinate space $k^n$. The isomorphisms $F : k^n \simeq V$ are in one-to-one
correspondence with the bases of $V$.


Proof For every isomorphism $F : k^n \simeq V$, the images of the standard basis vectors¹¹
$e_i \in k^n$ form a basis in $V$. Conversely, for any basis $v_1, v_2, \ldots, v_n$ in $V$, the map


$$F : k^n \to V, \quad (x_1, x_2, \ldots, x_n) \mapsto x_1 v_1 + x_2 v_2 + \cdots + x_n v_n,$$


is linear, bijective, and takes $e_i \in k^n$ to $v_i \in V$. □


¹¹ See formula (6.14) on p. 128.


Corollary 6.3 Every $n$-dimensional vector space $V$ over a finite field of cardinality
$q$ consists of $q^n$ vectors. □


Example 6.11 (Finite Field Extensions) Let $k \subset F$ be an extension of fields. Then
$F$ is a vector space over $k$. If it has finite dimension $\dim_k F = d$, then the extension
$k \subset F$ is called a finite extension of degree $d$. In this case, every element $\zeta \in F$ is
algebraic over $k$, because an infinite set of powers $\zeta^k$ is linearly related, and every
linear relation among the powers $\zeta^k$ represents a polynomial equation in $\zeta$.

In particular, every finite field $F$ of characteristic $p$ is a finite extension of the
prime subfield $\mathbb{F}_p \subset F$. Then $|F| = p^d$, where $d = \dim_{\mathbb{F}_p} F$, by Corollary 6.3. This
is a simple conceptual proof of Corollary 3.4 on p. 64.


Exercise 6.6 Show that $\mathbb{F}_{p^n}$ can be a subfield of $\mathbb{F}_{p^m}$ only if $n \mid m$.


6.2.4 Infinite-Dimensional Vector Spaces 


Without the assumption that $V$ is finitely generated, the proof of Theorem 6.1 has
to be modified in the spirit of Sect. 1.4.3 on p. 15 as follows. For any two sets of
vectors $J \subset S$ such that $J$ is linearly independent, write $\mathcal{I}_J(S)$ for the set of all
linearly independent sets $I$ such that $J \subset I \subset S$. The set $\mathcal{I}_J(S)$ is partially ordered
by inclusion and complete,¹² because every chain $I_\nu \in \mathcal{I}_J(S)$ has the upper bound
$I = \bigcup_\nu I_\nu$.


Exercise 6.7 Check that $I$ is linearly independent.


Hence, thanks to Zorn's lemma, Lemma 1.3 on p. 16, every linearly independent
set $J \subset S$ is contained in some maximal linearly independent set $E \subset S$. Since for
any vector $s \in S \smallsetminus E$, the set $E \cup \{s\}$ is linearly dependent, there exists a finite linear
combination


$$\lambda s + \lambda_1 e_1 + \lambda_2 e_2 + \cdots + \lambda_m e_m = 0, \quad\text{where } \lambda \ne 0 \text{ and } e_i \in E.$$


Therefore, all vectors in $S$ can be expressed as a linear combination of vectors in
$E$. In particular, if $S$ generates $V$, then $E$ is a basis in $V$. Hence, for any two sets
of vectors $J \subset S$ such that $J$ is linearly independent and $S$ spans $V$, there exists a
basis $E$ of $V$ extending $J$ and contained in $S$. This proves the first two statements of
Theorem 6.1.


¹² See Definition 1.2 on p. 16.


That any two bases of $V$ have the same cardinality follows from
the Cantor-Schröder-Bernstein theorem¹³ and the following transfinite extension of
Lemma 6.2:


Exercise 6.8 Given two sets of vectors $S, I \subset V$ such that $S$ spans $V$ and $I$ is linearly
independent, show that there exists a subset $R \subset S$ of the same cardinality as $I$ such
that $I \cup (S \smallsetminus R)$ spans $V$ as well.


6.3 Space of Linear Maps 


For every pair of vector spaces $U$, $W$, the linear maps¹⁴ $F : U \to W$ form a vector
space in which addition and multiplication by constants are defined by


$$F + G : u \mapsto F(u) + G(u),$$
$$\lambda F : u \mapsto \lambda F(u).$$


The space of linear maps $U \to W$ is denoted by $\mathrm{Hom}(U, W)$, or by $\mathrm{Hom}_k(U, W)$
when we need a precise reference to the ground field.


6.3.1 Kernel and Image 


Proposition 6.1 Let $F : V \to W$ be a linear map and $K, L \subset V$ two sets of vectors
such that $K$ is a basis in $\ker F$ and $K \cup L$ is a basis in $V$. Then all vectors $w_e = F(e)$,
$e \in L$, are distinct, and they form a basis in $\operatorname{im} F$. For finite-dimensional $V$, this
leads to the equality


$$\dim \ker F + \dim \operatorname{im} F = \dim V. \tag{6.18}$$


Proof Since $F$ sends an arbitrary vector $v = \sum_{f \in K} x_f f + \sum_{e \in L} y_e e \in V$ to


$$F(v) = \sum_{f \in K} x_f F(f) + \sum_{e \in L} y_e F(e) = \sum_{e \in L} y_e w_e,$$


it follows that the vectors $w_e$ span $\operatorname{im} F$. If there is a linear relation


$$0 = \sum_{e \in L} \lambda_e w_e = F\Bigl(\sum_{e \in L} \lambda_e e\Bigr),$$


then $\sum_{e \in L} \lambda_e e \in \ker F$ is forced to be a linear combination of vectors $f \in K$. Since
$L \cup K$ is a linearly independent set, all the $\lambda_e$ are equal to zero. □


¹³ It asserts that if there are injective maps of sets $A \hookrightarrow B$ and $B \hookrightarrow A$, then there is a bijection
$A \simeq B$.
¹⁴ See Sect. 6.1.2 on p. 124.
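

Over a finite field, equality (6.18) can be confirmed by direct counting, since by Corollary 6.3 a subspace of dimension $d$ over $\mathbb{F}_p$ has exactly $p^d$ elements. The sketch below is an illustration only; the matrix `A` over $\mathbb{F}_3$ and the helper names are assumptions made for this example.

```python
from itertools import product

def apply(matrix, v, p):
    """Image of the column vector v under the matrix, computed modulo p."""
    return tuple(sum(a * x for a, x in zip(row, v)) % p for row in matrix)

def kernel_and_image(matrix, n, p):
    """Enumerate F_p^n and collect the kernel and the image of the map."""
    kernel, image = [], set()
    for v in product(range(p), repeat=n):
        w = apply(matrix, v, p)
        image.add(w)
        if not any(w):
            kernel.append(v)
    return kernel, image

def dim(size, p):
    """Dimension d of a subspace with p**d elements."""
    d = 0
    while size > 1:
        size //= p
        d += 1
    return d

# A linear map F_3^4 -> F_3^3; the third row is the sum of the first two.
A = [[1, 2, 0, 1],
     [0, 1, 1, 2],
     [1, 0, 1, 0]]
kernel, image = kernel_and_image(A, 4, 3)
```

For this choice the kernel and the image both have 9 elements, so both dimensions equal 2 and their sum is $4 = \dim \mathbb{F}_3^4$.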


Corollary 6.4 The following properties of a linear endomorphism $F : V \to V$ of a
finite-dimensional vector space $V$ are equivalent:


(1) $F$ is bijective.
(2) $\ker F = 0$.
(3) $\operatorname{im} F = V$.


Proof Properties (2) and (3) are equivalent by the relation (6.18). Since (2) means
the injectivity of $F$, property (1) is equivalent to the simultaneous fulfillment of (2)
and (3). □


6.3.2 Matrix of a Linear Map 


Let $u_1, u_2, \ldots, u_n$ and $w_1, w_2, \ldots, w_m$ be bases in vector spaces $U$ and $W$ respectively.
Then every linear map $F : U \to W$ can be completely described by means of
some $m \times n$ matrix as follows. Expand the images of the basis vectors in $U$ through
the basis vectors in $W$ as


$$F(u_j) = \sum_{i=1}^{m} w_i \cdot f_{ij} \tag{6.19}$$


and write the coefficients $f_{ij} \in k$ as an $m \times n$ matrix whose $j$th column consists of
the $m$ coordinates of the vector $F(u_j)$:


$$\begin{pmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{m1} & f_{m2} & \cdots & f_{mn} \end{pmatrix} = \bigl(F(u_1), F(u_2), \ldots, F(u_n)\bigr) \in \mathrm{Mat}_{m \times n}(k). \tag{6.20}$$


This matrix is called the matrix of $F$ in the bases $u = (u_1, u_2, \ldots, u_n)$, $w =
(w_1, w_2, \ldots, w_m)$. It depends on the choice of both bases and is denoted by $F_{wu} =
(f_{ij})$ for short.
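

As an illustration of this column-by-column recipe, the following sketch builds the matrix of the differentiation map $k[x]_{\le 3} \to k[x]_{\le 2}$, $f \mapsto f'$, in the monomial bases, with $k = \mathbb{Q}$. The example map and the helper names `derivative`, `matrix_of`, and `apply` are assumptions made for this illustration; polynomials are stored as lists of coefficients, constant term first.

```python
from fractions import Fraction

def derivative(coeffs):
    """Coefficients of f' for f = sum of coeffs[i] * x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def matrix_of(linear_map, n, m):
    """m x n matrix whose j-th column holds the coordinates of the
    image of the j-th standard basis vector under linear_map."""
    columns = []
    for j in range(n):
        e_j = [Fraction(int(i == j)) for i in range(n)]
        image = linear_map(e_j)
        image = image + [Fraction(0)] * (m - len(image))
        columns.append(image)
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(matrix, x):
    """Matrix-vector product, i.e., formula (6.22) in coordinates."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

D = matrix_of(derivative, 4, 3)
f = [5, 1, -2, 4]          # f = 5 + x - 2x^2 + 4x^3
image_of_f = apply(D, f)   # coordinates of f' in the basis 1, x, x^2
```

Multiplying the coordinate column of $f$ by the matrix gives the coordinates of $F(f)$, so the matrix carries all the information about the map.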


Exercise 6.9 Verify that addition of linear maps and multiplication of linear maps
by constants correspond to the addition of their matrices and the multiplication of
the matrices by constants as defined in Example 6.7 on p. 129.


Proposition 6.2 For every choice of bases $u = (u_1, u_2, \ldots, u_n)$ in $U$ and $w =
(w_1, w_2, \ldots, w_m)$ in $W$, the map


$$\mathrm{Hom}_k(U, W) \to \mathrm{Mat}_{m \times n}(k), \quad F \mapsto F_{wu}, \tag{6.21}$$


is a linear isomorphism of vector spaces. In particular, $\dim \mathrm{Hom}(U, W) = \dim U \cdot
\dim W$.


Proof The map (6.21) is linear by Exercise 6.9 and injective, because every linear
map $F : U \to W$ is completely determined by its action on the basis vectors: as
soon as the matrix of $F$ is given, every vector of the form $v = \sum u_j x_j$ will be sent to


$$F(v) = F\Bigl(\sum_{j=1}^{n} u_j x_j\Bigr) = \sum_{j=1}^{n} F(u_j)\, x_j = \sum_{j=1}^{n} \sum_{i=1}^{m} w_i \cdot f_{ij} x_j. \tag{6.22}$$


On the other hand, for every matrix $(f_{ij}) \in \mathrm{Mat}_{m \times n}(k)$, the assignment (6.22) defines
a linear map $F : U \to W$ whose matrix in the bases $u$, $w$ is exactly the matrix $(f_{ij})$.


Exercise 6.10 Check this.


Therefore, the map (6.21) is surjective. □


Example 6.12 (Qualitative Theory of Systems of Linear Equations) A system of
linear equations


$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1, \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2, \\
a_{31} x_1 + a_{32} x_2 + \cdots + a_{3n} x_n &= b_3, \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m,
\end{aligned} \tag{6.23}$$


means that the linear map $A : k^n \to k^m$ whose matrix in the standard bases¹⁵ of the
spaces $k^n$, $k^m$ is $(a_{ij})$ sends the unknown vector $x = (x_1, x_2, \ldots, x_n) \in k^n$ to a given
vector $b = (b_1, b_2, \ldots, b_m) \in k^m$. In other words, the system (6.23) is an expanded
coordinate representation for the single linear matrix-vector equation $A(x) = b$ in
the unknown vector $x \in k^n$. Here $b \in k^m$ is a given vector and $A : k^n \to k^m$
is a given linear map. We conclude that for $b \notin \operatorname{im} A$, the system (6.23) has no
solutions, whereas for $b \in \operatorname{im} A$, the solutions of (6.23) form a parallel copy of the
vector subspace $\ker A \subset k^n$ shifted by any vector $v \in A^{-1}(b) \subset k^n$. In other words,
the difference $x - x'$ of any two solutions $x$, $x'$ of the system (6.23) is a solution
of the system of linear homogeneous equations obtained from (6.23) by taking all
$b_i = 0$. The following two corollaries follow immediately from Proposition 6.1 and
Corollary 6.4.


¹⁵ See Example 6.6 on p. 128.


Corollary 6.5 If all $b_i$ in (6.23) are equal to zero, then the solutions $x$ form a vector
space of dimension at least $n - m$. In particular, for $n > m$, such a system has
nonzero solutions. □


Corollary 6.6 (Fredholm Alternative) If $m = n$ in (6.23), then either the
equation $A(x) = b$ has a unique solution $x$ for every $b \in k^n$ or the homogeneous
equation $A(x) = 0$ has a nonzero solution $x \ne 0$. □


6.4 Vector Subspaces 


6.4.1 Codimension 


It follows from Theorem 6.1 on p. 132 that every basis of a subspace $U \subset V$ can
be extended to a basis of $V$. In particular, every subspace $U$ in a finite-dimensional
vector space $V$ is finite-dimensional, too, and $\dim U \le \dim V$. The difference of
dimensions


$$\operatorname{codim}_V U := \dim V - \dim U$$


is called the codimension of the vector subspace $U \subset V$. For example, Proposition 6.1
on p. 135 says that for every linear map $F$, the codimension of $\ker F$ equals
the dimension of $\operatorname{im} F$.


Example 6.13 (Hyperplanes) Vector subspaces of codimension 1 in $V$ are called
hyperplanes. For example, the kernel of every nonzero linear map $\xi : V \to k$ is
a hyperplane in $V$. Indeed, since $\operatorname{im} \xi \subset k$ is nonzero, it is at least 1-dimensional
and therefore coincides with $k$. Hence, $\dim \ker \xi = \dim V - \dim \operatorname{im} \xi = \dim V - 1$.
For example, given some $a \in k$, the polynomials $f \in k[x]$ such that $\deg f \le n$ and
$f(a) = 0$ form a hyperplane in the vector space $k[x]_{\le n}$ of all polynomials of degree
at most $n$, because they form the kernel of the evaluation map $\mathrm{ev}_a : k[x]_{\le n} \to k$,
$f \mapsto f(a)$, which is linear in $f$.


Exercise 6.11 Show that every hyperplane $W \subset V$ is a kernel of some nonzero
linear map $\xi : V \to k$ determined by $W$ uniquely up to proportionality.


6.4.2 Linear Spans


The intersection of any set of vector subspaces in $V$ is a subspace as well. Hence,
for every set of vectors $M \subset V$, there exists the smallest vector subspace containing
$M$. It is called the linear span of $M$ and is denoted by


$$\operatorname{span}(M) = \bigcap_{U \supset M} U. \tag{6.24}$$


More explicitly, $\operatorname{span}(M) \subset V$ can be described as the set of all finite linear
combinations of vectors in $M$. Indeed, such linear combinations form a vector space
and belong to every vector subspace containing $M$.


Exercise 6.12*. Show that span(M) coincides with the intersection of all hyper- 
planes containing M. 


6.4.3 Sum of Subspaces 


The union of vector subspaces is almost never a vector space. For example, the
union of the 1-dimensional subspaces spanned by the standard basis vectors $e_1$ and
$e_2$ in the coordinate plane $k^2$ does not contain the sum $e_1 + e_2$.


Exercise 6.13 Show that the union of two vector subspaces is a vector space if and 
only if one of the subspaces is contained in the other. 


For a collection of subspaces $U_\nu \subset V$, the linear span of their union is called the
sum of the subspaces $U_\nu$ and is denoted by $\sum_\nu U_\nu := \operatorname{span} \bigcup_\nu U_\nu$. It consists of all
finite sums of vectors $u_\nu \in U_\nu$. For example,


$$U_1 + U_2 = \{u_1 + u_2 \mid u_1 \in U_1,\ u_2 \in U_2\},$$
$$U_1 + U_2 + U_3 = \{u_1 + u_2 + u_3 \mid u_1 \in U_1,\ u_2 \in U_2,\ u_3 \in U_3\}.$$


Proposition 6.3 For any two finite-dimensional vector subspaces $U$, $W$ in an
arbitrary vector space $V$, we have the equality


$$\dim(U) + \dim(W) = \dim(U \cap W) + \dim(U + W).$$


Proof Fix some basis $e_1, e_2, \ldots, e_k$ in $U \cap W$ and extend it to bases in $U$ and $W$
by appropriate vectors $u_1, u_2, \ldots, u_r \in U$ and $w_1, w_2, \ldots, w_s \in W$ respectively. It is
enough to check that the vectors $e_1, e_2, \ldots, e_k, u_1, u_2, \ldots, u_r, w_1, w_2, \ldots, w_s$ form a
basis in $U + W$. They certainly generate $U + W$. If there exists a linear relation


$$\lambda_1 e_1 + \lambda_2 e_2 + \cdots + \lambda_k e_k + \nu_1 u_1 + \nu_2 u_2 + \cdots + \nu_r u_r + \mu_1 w_1 + \mu_2 w_2 + \cdots + \mu_s w_s = 0,$$


we can move all the vectors $w_1, w_2, \ldots, w_s$ to the right-hand side and get the equality


$$\lambda_1 e_1 + \lambda_2 e_2 + \cdots + \lambda_k e_k + \nu_1 u_1 + \nu_2 u_2 + \cdots + \nu_r u_r = -\mu_1 w_1 - \mu_2 w_2 - \cdots - \mu_s w_s,$$


whose left-hand side lies in $U$, whereas the right-hand side lies in $W$. Thus, both
sides are linear expansions of the same vector $v \in U \cap W$ through the bases of $U$
and of $W$ respectively. By the construction of these bases, all coefficients $\nu_i$ and $\mu_i$
must vanish, which means that $v = 0$. Hence, all the $\lambda_i$ are equal to zero as well. □
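

The equality of Proposition 6.3 can be checked by direct enumeration over $\mathbb{F}_2$, where a subspace of dimension $d$ has exactly $2^d$ elements. The following is a small sketch under that assumption; the subspaces `U`, `W` of $\mathbb{F}_2^4$ and the helper names are chosen only for illustration.

```python
def span(vectors, n):
    """Set of all F_2-linear combinations of the given vectors in F_2^n."""
    result = {tuple([0] * n)}
    for v in vectors:
        result |= {tuple((a + b) % 2 for a, b in zip(w, v)) for w in result}
    return result

def dim(subspace):
    """Dimension of an F_2-subspace given as a set of 2**d vectors."""
    return len(subspace).bit_length() - 1

U_gens = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
W_gens = [(0, 1, 0, 0), (0, 0, 1, 1)]

U = span(U_gens, 4)
W = span(W_gens, 4)
intersection = U & W                      # U ∩ W as a set of vectors
total = span(U_gens + W_gens, 4)          # U + W is spanned by both generator lists
```

In this example $\dim U = 3$, $\dim W = 2$, $\dim(U \cap W) = 1$, and $\dim(U + W) = 4$, so both sides of the equality are equal to 5.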


Corollary 6.7 For any two subspaces $U$, $W$ of a finite-dimensional vector space $V$,
the inequality


$$\dim(U \cap W) \ge \dim(U) + \dim(W) - \dim(V)$$


holds. In particular, if $\dim(U) + \dim(W) > \dim V$, then $U \cap W \ne 0$.


Proof This follows at once from Proposition 6.3 and the inequality $\dim(U + W) \le
\dim V$. □


Corollary 6.8 The following conditions on finite-dimensional subspaces $U, W \subset V$
are equivalent:


(1) $\dim(U + W) = \dim U + \dim W$;

(2) $U \cap W = 0$;

(3) every vector $v \in U + W$ has a unique decomposition $v = u + w$, where $u \in U$,
$w \in W$.


Proof Conditions (1), (2) are equivalent by Proposition 6.3. Let us prove the
equivalence of (2) and (3). If there is some nonzero vector $v \in U \cap W$, then
$0 + 0 = 0 = v + (-v)$ are two different decompositions of the zero vector as a sum
$u + w$, $u \in U$, $w \in W$. Conversely, the equality $u_1 + w_1 = u_2 + w_2$ for $u_1, u_2 \in U$,
$w_1, w_2 \in W$ implies the equality $u_1 - u_2 = w_2 - w_1$, in which $u_1 - u_2 \in U$, whereas
$w_2 - w_1 \in W$. The condition $U \cap W = 0$ forces the vector $u_1 - u_2 = w_2 - w_1 \in U \cap W$
to be zero. Hence $u_1 = u_2$ and $w_1 = w_2$. □


6.4.4 Transversal Subspaces


Two subspaces $U, W \subset V$ are said to be transversal if they satisfy the conditions
from Corollary 6.8 above. A sum of transversal subspaces is called a direct sum and
is denoted by $U \oplus W$. Transversal subspaces $U, W \subset V$ are called complementary
if $U \oplus W = V$. By Corollary 6.8, two subspaces $U, W \subset V$ are complementary if
and only if $U \cap W = 0$ and $\dim(U) + \dim(W) = \dim(V)$.


Exercise 6.14 Given a linear map $\xi : V \to k$ and a vector $v \in V$ such that $\xi(v) \ne 0$,
show that the 1-dimensional subspace $k \cdot v$ spanned by $v$ is complementary to the
hyperplane $\ker \xi$.


More generally, a sum of subspaces $U_1, U_2, \ldots, U_m \subset V$ is said to be direct,
denoted by


$$U_1 \oplus U_2 \oplus \cdots \oplus U_m,$$


if every vector $w \in U_1 + U_2 + \cdots + U_m$ admits a unique expansion $w = u_1 + u_2
+ \cdots + u_m$, where $u_i \in U_i$. In other words, a sum of subspaces $U_1, U_2, \ldots, U_m \subset V$
is direct if and only if every collection of nonzero vectors $u_1, u_2, \ldots, u_m$, $u_i \in U_i$, is linearly
independent. For example, the vectors $e_1, e_2, \ldots, e_n$ form a basis of $V$ if and only
if $V$ decomposes as the direct sum of $n$ 1-dimensional subspaces spanned by the
vectors $e_i$.


Exercise 6.15 Show that a sum of subspaces is direct if and only if each of them is 
transversal to the sum of all the others. 


6.4.5 Direct Sums and Direct Products 


Given a family of vector spaces $V_\nu$, where $\nu$ runs through some set $X$, the direct
product of abelian groups¹⁶ $\prod_{\nu \in X} V_\nu$ is turned into a vector space by means of
componentwise multiplication by scalars. Thus, $\lambda \cdot (u_\nu) + \mu \cdot (w_\nu) := (\lambda u_\nu + \mu w_\nu)$
for any two families $(u_\nu)_{\nu \in X}, (w_\nu)_{\nu \in X} \in \prod V_\nu$. The resulting vector space is called
the direct product of the vector spaces $V_\nu$. The families $(v_\nu)$ having just a finite
number of nonzero elements $v_\nu$ form a subspace in the direct product. This subspace
is denoted by $\bigoplus_{\nu \in X} V_\nu \subset \prod_{\nu \in X} V_\nu$ and called the direct sum of the vector spaces
$V_\nu$. For all finite sets of spaces $V_1, V_2, \ldots, V_n$, the direct sum is equal to the direct
product: $V_1 \oplus V_2 \oplus \cdots \oplus V_n = V_1 \times V_2 \times \cdots \times V_n$.


Exercise 6.16 Let the vector space $V$ be the direct sum of subspaces
$U_1, U_2, \ldots, U_m \subset V$ in the sense of Sect. 6.4.4. Show that $V$ is isomorphic to
the direct sum of spaces $U_i$ considered as abstract standalone vector spaces.


For an infinite set $X$ of vector spaces $V_\nu$, the direct sum $\bigoplus V_\nu$ is a proper subspace
in the direct product $\prod V_\nu$. For example, the direct sum of a countable family of
1-dimensional vector spaces $\mathbb{Q}$ is isomorphic to the space of polynomials $\mathbb{Q}[x]$ and
is countable as a set, whereas the direct product of the same family of spaces is
isomorphic to the space of formal power series $\mathbb{Q}[[x]]$ and is uncountable.¹⁷


¹⁶ See Sect. 2.5 on p. 30.
¹⁷ Compare with Example 6.10 on p. 130.


6.5 Affine Spaces


6.5.1 Definition and Examples 


Let $V$ be a vector space over a field $k$. A set $A$ is called an affine space over $V$ if for
every vector $v \in V$, a shift transformation¹⁸ $\tau_v : A \to A$ is given such that


$$(1)\ \tau_0 = \mathrm{Id}_A, \qquad (2)\ \forall\, v, w \in V \quad \tau_{v+w} = \tau_v \circ \tau_w, \tag{6.25}$$
$$(3)\ \forall\, p, q \in A \ \ \exists \text{ a unique } v \in V : \ \tau_v(p) = q. \tag{6.26}$$


The first two conditions (6.25) assert that all the shift transformations $\tau_v$, $v \in V$,
form an abelian transformation group¹⁹ of the set $A$. The elements of this group
are in bijection with the vectors $v \in V$. The opposite vectors $v$ and $-v$ correspond
to the inverse shift transformations $\tau_v$ and $\tau_{-v} = \tau_v^{-1}$. The third condition (6.26)
means that each point $q \in A$ can be obtained from an arbitrarily given point $p \in A$
by a unique shift transformation $\tau_v$. We write $\overrightarrow{pq}$ for the vector producing this
transformation. One can think of this vector as an arrow drawn from $p$ to $q$. It follows
from (6.25) that $\overrightarrow{pp} = 0$ for all $p \in A$ and $\overrightarrow{pq} + \overrightarrow{qr} = \overrightarrow{pr}$ for all $p, q, r \in A$. This
forces $\overrightarrow{pq} = -\overrightarrow{qp}$.


Exercise 6.17 Prove that $\overrightarrow{pq} = \overrightarrow{rs} \iff \overrightarrow{pr} = \overrightarrow{qs}$.


A shift transformation $\tau_v$ can be thought of as an operation of adding a vector
$v \in V$ to the points $p \in A$. For this reason, we often write $p + v$ instead of $\tau_v(p)$.

By definition, the dimension of an affine space A associated with a given vector 
space V is equal to dim V. 


Example 6.14 The set of monic polynomials of degree $m$ forms an affine space of
dimension $m$ over the vector space $V = k[x]_{\le (m-1)}$ of polynomials of degree at most
$m - 1$. The shift transformation $\tau_h$, $h \in V$, takes each monic polynomial $f$ to $f + h$.
Any two monic polynomials $f$, $g$ of degree $m$ differ by a polynomial $h = f - g \in V$,
which is uniquely determined by $f$, $g$.


Example 6.15 The previous example is a particular case of the following situation.
For a vector subspace $U \subset V$ and vector $v \in V$, let $A = v + U = \{v + u \mid u \in U\}$
be the shift of $U$ by the vector $v$. Then $A$ is an affine space associated with $U$. For
$h \in U$, the shift transformation $\tau_h$ maps $v + u \mapsto v + u + h$. Any two points $p = v + u$
and $q = v + w$ differ by the vector $\overrightarrow{pq} = w - u \in U$, uniquely determined by $p$, $q$.


¹⁸ Also called a parallel displacement.
¹⁹ See Sect. 1.3.4 on p. 12.


6.5.2 Affinization and Vectorization


A vector space $V$ produces an affine space $\mathbb{A}(V)$ over $V$ called the affinization of
$V$. By definition, the points of $\mathbb{A}(V)$ are the vectors of $V$. The shift transformation
$\tau_w : \mathbb{A}(V) \to \mathbb{A}(V)$ takes $v$ to $v + w$. One can think of the points in $\mathbb{A}(V)$ as the heads
of radial vectors $v \in V$ drawn from the origin, which is the point corresponding to
the zero vector.


Exercise 6.18 Check that properties (6.25), (6.26) are satisfied in $\mathbb{A}(V)$.


We write $\mathbb{A}^n = \mathbb{A}(k^n)$ for the affinization of the $n$-dimensional coordinate vector
space. 

Conversely, every affine space $A$ associated with a vector space $V$ can be
vectorized as follows. Choose some point $p \in A$, which will be called the center
of the vectorization or the origin, and consider the bijective map


$$\mathrm{vec}_p : A \to V, \quad q \mapsto \overrightarrow{pq},$$


which sends a point to its radial vector drawn from the origin. This map provides
$A$ with a vector space structure transferred from $V$. The resulting vector space is
called the vectorization of $A$ centered at $p$. Note that the vector space structure on
$A$ obtained by means of a vectorization depends on the center of vectorization. Two
addition operations $A \times A \to A$ obtained under different choices of the origin are
different at least in that they have different neutral elements.

Every collection $p, e_1, e_2, \ldots, e_n$ that consists of a point $p \in A$ and a basis
$e_1, e_2, \ldots, e_n$ of $V$ is called an affine coordinate system in $A$. The coefficients $x_i$
of the linear expansion


$$\overrightarrow{pq} = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$$


are called the affine coordinates of the point $q$ in the affine coordinate system
$p, e_1, e_2, \ldots, e_n$.


6.5.3 Center of Mass 


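Given a collection of points $p_1, p_2, \ldots, p_m$ in an affine space $A$ and a collection of
constants $\mu_1, \mu_2, \ldots, \mu_m \in k$ such that $\sum \mu_i \ne 0$, there exists a unique point $c \in A$
such that


$$\mu_1 \overrightarrow{cp_1} + \mu_2 \overrightarrow{cp_2} + \cdots + \mu_m \overrightarrow{cp_m} = 0. \tag{6.27}$$


Fig. 6.1 Moments of inertia


Indeed, if we write the linear combination (6.27) for any other point $q$ in the role of
$c$, then the difference between the two sums is


$$\sum \mu_i \overrightarrow{qp_i} - \sum \mu_i \overrightarrow{cp_i} = \sum \mu_i \bigl(\overrightarrow{qp_i} - \overrightarrow{cp_i}\bigr) = \Bigl(\sum \mu_i\Bigr) \cdot \overrightarrow{qc} = \mu \cdot \overrightarrow{qc},$$


where $\mu = \sum \mu_i$. Therefore, if $q$ is fixed and $c$ varies through $A$, the left-hand side
of (6.27) vanishes for a unique point $c$ that satisfies $\mu \cdot \overrightarrow{qc} = \sum \mu_i \overrightarrow{qp_i}$. We conclude
that the point


$$c = q + \frac{\mu_1}{\mu} \overrightarrow{qp_1} + \frac{\mu_2}{\mu} \overrightarrow{qp_2} + \cdots + \frac{\mu_m}{\mu} \overrightarrow{qp_m}, \quad\text{where } \mu = \sum \mu_i, \tag{6.28}$$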


is independent of the choice of $q$ and is the only point in $A$ satisfying (6.27). This
point $c$ is called the center of mass²⁰ for the points $p_1, p_2, \ldots, p_m \in A$ taken with
masses $\mu_1, \mu_2, \ldots, \mu_m \in k$. The terminology comes from mechanics, where the
ground field is $k = \mathbb{R}$. For the affine space $\mathbb{R}^n$ immersed as a horizontal hyperplane
in $\mathbb{R}^{n+1}$ (see Fig. 6.1), the vectors $\mu_i \cdot \overrightarrow{cp_i}$ are called moments of inertia with respect to
the point $c$ for the forces $\mu_i$ acting at $p_i$ vertically upward if $\mu_i > 0$ and downward
if $\mu_i < 0$. The vanishing of the total sum of the moments means that the hyperplane
$\mathbb{R}^n$ fastened like a hinge only at the point $c$ remains balanced within $\mathbb{R}^{n+1}$ under the
action of all the forces.

In the special case $\mu = \sum \mu_i = 1$, the point $c$ defined in (6.28) is denoted by


$$\mu_1 p_1 + \mu_2 p_2 + \cdots + \mu_m p_m \tag{6.29}$$


and called the barycenter of points $p_i$ with weights $\mu_i$. The expression (6.29) is
called a barycentric combination of the points $p_i$ with weights $\mu_i$. Let me stress that
this expression makes sense only for $\sum \mu_i = 1$.
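

Formula (6.28) is easy to test numerically. The sketch below is an illustration under the assumption that $A = \mathbb{A}^2$ over $k = \mathbb{Q}$, with points stored as coordinate tuples; the function name `center_of_mass` and the sample data are hypothetical. It computes $c$ from two different auxiliary points $q$ and checks relation (6.27).

```python
from fractions import Fraction

def center_of_mass(points, masses, q):
    """Point c = q + sum_i (mu_i / mu) * vector(q, p_i), as in (6.28)."""
    mu = Fraction(sum(masses))
    if mu == 0:
        raise ValueError("total mass must be nonzero")
    return tuple(
        q[k] + sum(m * (p[k] - q[k]) for p, m in zip(points, masses)) / mu
        for k in range(len(q))
    )

points = [(0, 0), (4, 0), (0, 6)]
masses = [1, 2, 3]

c_from_origin = center_of_mass(points, masses, (0, 0))
c_from_other = center_of_mass(points, masses, (7, -5))

# Left-hand side of (6.27): the sum of mu_i * vector(c, p_i).
moment_sum = tuple(
    sum(m * (p[k] - c_from_origin[k]) for p, m in zip(points, masses))
    for k in range(2)
)
```

Both choices of $q$ give the same point $c = (4/3, 3)$, and the weighted sum of the vectors from $c$ to the points $p_i$ is zero.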


Exercise 6.19 Consider the vectorization $\mathrm{vec}_o : A \to V$, $q \mapsto \overrightarrow{oq}$, centered at some
point $o \in A$ and define the point $c \in A$ by $\overrightarrow{oc} = \mu_1 \overrightarrow{op_1} + \mu_2 \overrightarrow{op_2} + \cdots + \mu_m \overrightarrow{op_m}$.
Check that $c$ does not depend on the choice of $o$ if and only if $\sum \mu_i = 1$.


²⁰ Or center of gravity.


If each point $p_i$ in the barycentric combination (6.29) is a barycentric combination
of some other points $q_{ij}$ with weights $x_{ij}$, i.e., $p_i = \sum_j x_{ij} q_{ij}$, then the barycentric
combination (6.29) is simultaneously a barycentric combination of the points $q_{ij}$,
because $\sum_i \mu_i \sum_j x_{ij} q_{ij} = \sum_{i,j} \mu_i x_{ij} q_{ij}$, and the new weights $\lambda_{ij} = \mu_i x_{ij}$ have sum 1 as
well: $\sum_{i,j} \lambda_{ij} = \sum_i \mu_i \sum_j x_{ij} = \sum_i \mu_i = 1$.


Exercise 6.20 (Mass Grouping Theorem) Consider two collections of points $p_i$,
$q_j$ with weights $\lambda_i$, $\mu_j$ such that the total weights $\lambda = \sum \lambda_i$, $\mu = \sum \mu_j$ and their
sum $\lambda + \mu$ are nonzero. Write $p$ and $q$ for the centers of mass of the weighted points
$p_i$ and $q_j$ respectively. Show that the center of mass of the points $p$, $q$ with weights
$\lambda$, $\mu$ coincides with the center of mass of the total collection of weighted points²¹
$p_i$, $q_j$.


Example 6.16 (Convex Figures in $\mathbb{R}^n$) Let $k = \mathbb{R}$, and let $\mathbb{A}^n$ be an affine space
over $\mathbb{R}^n$. A barycentric combination $\sum \lambda_i \cdot p_i$ of points $p_i \in \mathbb{A}^n$ is said to be convex
if all the $\lambda_i$ are nonnegative. The set of all convex combinations of two given points
$a$, $b$ is denoted by


$$[a, b] = \{\lambda a + \mu b \mid \lambda + \mu = 1,\ \lambda, \mu \ge 0\}$$


and called the segment with endpoints $a$, $b$. The set $\Phi \subset \mathbb{A}^n$ is convex if every
convex combination of every finite collection of points $p_i \in \Phi$ belongs to $\Phi$. For
example, every segment $[a, b]$ is convex. Since every finite convex combination can
be written as a convex combination of two appropriate points by Exercise 6.20, a
set $\Phi$ is convex if and only if every segment with endpoints in $\Phi$ lies within $\Phi$. The
intersection of convex figures is clearly convex. For $M \subset \mathbb{A}^n$, the intersection of all
convex sets containing $M$ is called the convex hull of $M$ and is denoted by $\operatorname{conv}(M)$.
Equivalently, $\operatorname{conv}(M)$ consists of all finite convex barycentric combinations of
points in $M$.


Exercise 6.21 Verify that the set conv(M) is really convex. 


6.5.4 Affine Subspaces 


Let $A$ be an affine space over some vector space $V$. For a given point $p \in A$ and
vector subspace $U \subset V$, the set $\Pi(p, U) = p + U = \{\tau_u(p) \mid u \in U\}$ is obviously
an affine space over $U$. It is called the affine subspace of dimension $\dim U$ passing
through $p$ and parallel to $U$. The subspace $U$ is also called a direction subspace of
the affine subspace $\Pi(p, U)$.


²¹ Each pair of coinciding points $p_i = q_j$ appears in the total collection with weight $\lambda_i + \mu_j$.


Example 6.17 (Lines and Planes) Affine subspaces of dimensions 1 and 2 are
called lines and planes. Therefore, an affine line is nothing but the "trajectory of a
free particle," that is, a locus of points $p + vt$, where $p \in A$ is some "starting" point,
$v \in V$ is some nonzero "velocity" vector, and "time" $t$ runs through the ground field
$k$. Similarly, an affine plane is a locus of points $p + \lambda u + \mu w$, where $p \in A$ is some
base point, $u, w \in V$ are some nonproportional vectors, and the coefficients $\lambda$, $\mu$ run
independently through $k$. Of course, every line (or plane) has many different such
representations depending on the choice of starting points and velocity vectors.


Proposition 6.4 Let the affine subspaces $\Pi(p, U)$, $\Pi(q, U)$ share the same direction
subspace $U \subset V$. Then the following properties are equivalent:

(1) $\overrightarrow{pq} \in U$,

(2) $\Pi(p, U) = \Pi(q, U)$,

(3) $\Pi(p, U) \cap \Pi(q, U) \neq \varnothing$,

(4) $p \in \Pi(q, U)$,

(5) $q \in \Pi(p, U)$.


Proof We first check that (1) $\Rightarrow$ (2). If $\overrightarrow{pq} \in U$, then every point $q + u$, $u \in U$, can
be written as $p + w$ for $w = \overrightarrow{pq} + u \in U$. Conversely, every point $p + w$, $w \in U$, can
be written as $q + u$ for $u = w - \overrightarrow{pq} \in U$. Hence, $\Pi(p, U) = \Pi(q, U)$. Condition
(2) certainly implies (3), (4), (5), and each of conditions (4) and (5) implies (3).
Thus, to finish the proof it is enough to show that (3) $\Rightarrow$ (1). Let $r = p + u' =
q + u'' \in \Pi(p, U) \cap \Pi(q, U)$, where both $u' = \overrightarrow{pr}$ and $u'' = \overrightarrow{qr}$ belong to $U$. Then

$$\overrightarrow{pq} = \overrightarrow{pr} + \overrightarrow{rq} = u' - u'' \in U. \qquad \square$$


Lemma 6.3 Given $k + 1$ points $p_0, p_1, \ldots, p_k$ in an affine space $\mathbb{A}$ over a vector
space $V$, the vectors

$$\overrightarrow{p_0p_1},\ \overrightarrow{p_0p_2},\ \ldots,\ \overrightarrow{p_0p_k}$$

are linearly independent in $V$ if and only if the points $p_0, p_1, \ldots, p_k$ do not belong
to a common affine subspace of dimension less than $k$ in $\mathbb{A}$.


Proof The vectors $\overrightarrow{p_0p_1}, \overrightarrow{p_0p_2}, \ldots, \overrightarrow{p_0p_k}$ are linearly related if and only if their linear
span has dimension less than $k$. The latter is equivalent to the existence of a linear
subspace $U \subset V$ such that $\dim U < k$ and $\overrightarrow{p_0p_i} \in U$ for all $i$. These two conditions
say that the affine subspace $p_0 + U$ has dimension less than $k$ and contains all points
$p_i$. $\square$


Definition 6.4 Points $p_0, p_1, \ldots, p_k \in \mathbb{A}$ satisfying the conditions from Lemma 6.3
are called linearly general or affinely independent.
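Lemma 6.3 turns affine independence into a rank computation. The Python sketch below is a minimal illustration under stated assumptions: points are coordinate tuples over $\mathbb{Q}$, and the function names `rank` and `affinely_independent` are chosen here for the example only.

```python
from fractions import Fraction as Fr

def rank(rows):
    """Rank of a matrix given as a list of rows, by Gaussian elimination over Q."""
    m = [[Fr(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    """True if the vectors p0p1, ..., p0pk are linearly independent."""
    p0, rest = points[0], points[1:]
    vectors = [[a - b for a, b in zip(p, p0)] for p in rest]
    return rank(vectors) == len(vectors)

triangle = [(0, 0), (1, 0), (0, 1)]        # three points not on a common line
collinear = [(0, 0), (1, 1), (2, 2)]       # three points on one line
```

By the lemma, the answer does not depend on which of the points is taken as $p_0$.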


Proposition 6.5 For any linearly general $k + 1$ points in an affine space $\mathbb{A}$, there
exists a unique $k$-dimensional affine subspace containing these $k + 1$ points. If
$\dim \mathbb{A} \geqslant k + 1$, then the converse statement is also true.


Proof An affine subspace $p_0 + U$ contains all points $p_i$ if and only if all vectors
$\overrightarrow{p_0p_i}$ belong to $U$. The linear independence of these vectors means that they form
a basis in a $k$-dimensional vector subspace $U \subset V$ containing them all. Therefore,
every such subspace coincides with the linear span of the vectors $\overrightarrow{p_0p_i}$. Conversely,
let the vectors $\overrightarrow{p_0p_1}, \overrightarrow{p_0p_2}, \ldots, \overrightarrow{p_0p_k}$ span a subspace $U$ of dimension $\ell < k$. Then
$n = \dim V \geqslant \ell + 2$, and there exists a basis $e_1, e_2, \ldots, e_n$ in $V$ such that the first $\ell$
vectors $e_1, e_2, \ldots, e_\ell$ form a basis in $U$. Write $W'$ and $W''$ for the linear spans of the
vectors²²

$$e_1, \ldots, e_\ell, e_{\ell+1}, e_{\ell+3}, \ldots, e_{k+1} \quad\text{and}\quad e_1, \ldots, e_\ell, e_{\ell+2}, e_{\ell+3}, \ldots, e_{k+1}.$$

The affine subspaces $p_0 + W'$, $p_0 + W''$ are distinct and $k$-dimensional, and each
contains all the points $p_i$. $\square$


Example 6.18 (Linear Equations: Variation of Example 6.12) Write $U \subset k^n$ for the
linear space of solutions of the homogeneous system of linear equations

$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = 0,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = 0,\\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n = 0,\\
\qquad\cdots\cdots\cdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = 0.
\end{cases} \tag{6.30}$$

Equivalently, $U = \ker A$, where $A : k^n \to k^m$ is the linear map with matrix $A = (a_{ij})$
in the standard bases of $k^n$ and $k^m$. For a point $p \in \mathbb{A}^n = \mathbb{A}(k^n)$, the affine subspace
$p + U = p + \ker A$ is the set of solutions of the inhomogeneous system of linear
equations $A(x) = b$ whose right-hand side is given by $b = A(p) \in k^m$. This is just
another reformulation of the equality $A^{-1}(A(p)) = p + \ker A$ from Proposition 2.1
on p. 32. We conclude again that the solution set of the system

$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2,\\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n = b_3,\\
\qquad\cdots\cdots\cdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m
\end{cases}$$

either is empty (for $b \notin \operatorname{im} A$) or is an affine subspace in $\mathbb{A}^n$ with direction subspace
$U = \ker A$, the solutions of the homogeneous system (6.30).
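A small numerical check may help to see the statement at work. The Python sketch below assumes a concrete $2 \times 3$ matrix over $\mathbb{Q}$ chosen purely for illustration; the names `apply` and `shifted` are hypothetical helpers.

```python
from fractions import Fraction as Fr

def apply(A, x):
    """Value A(x) of the linear map with matrix A on the coordinate column x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def shifted(p, u, t):
    """The point p + t*u of the affine line through p with direction u."""
    return tuple(pi + t * ui for pi, ui in zip(p, u))

A = [[1, 1, 1],
     [1, 2, 3]]
u = (1, -2, 1)      # a vector spanning ker A for this particular matrix
p = (1, 0, 2)       # an arbitrarily chosen point
b = apply(A, p)     # right-hand side b = A(p)

# Every point of p + ker A solves A(x) = b ...
on_line = [shifted(p, u, Fr(t, 3)) for t in range(-5, 6)]
# ... and the difference of two solutions lies in ker A.
x = (2, -2, 3)
difference = tuple(xi - pi for xi, pi in zip(x, p))
```

Here $\ker A$ is one-dimensional, so the solution set of $A(x) = b$ is the affine line $p + tu$.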


22 Each collection contains $k$ vectors and is obtained by removing either the vector $e_{\ell+2}$ or the
vector $e_{\ell+1}$ from the collection $e_1, e_2, \ldots, e_{k+1}$ of $k + 1$ vectors.

6.5.5 Affine Maps


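Let $\varphi : \mathbb{A} \to \mathbb{B}$ be a map of affine spaces $\mathbb{A}$, $\mathbb{B}$ associated with vector spaces $U$, $W$.
Choose an origin $p \in \mathbb{A}$ and take $\varphi(p)$ as the origin in $\mathbb{B}$. Then $\varphi$ induces the map of
vectorizations

$$D_p\varphi : U \to W, \qquad \overrightarrow{pq} \mapsto \overrightarrow{\varphi(p)\varphi(q)}. \tag{6.31}$$


Lemma 6.4 If a map (6.31) is linear for some $p \in \mathbb{A}$, then it does not depend on
the choice of $p \in \mathbb{A}$.


Proof If $D_p\varphi$ is linear, then for every point $r \in \mathbb{A}$ and vector $u = \overrightarrow{rq} = \overrightarrow{pq} - \overrightarrow{pr} \in U$,

$$\begin{aligned}
D_r\varphi(u) &= \overrightarrow{\varphi(r)\varphi(q)} = \overrightarrow{\varphi(p)\varphi(q)} - \overrightarrow{\varphi(p)\varphi(r)} = D_p\varphi(\overrightarrow{pq}) - D_p\varphi(\overrightarrow{pr})\\
&= D_p\varphi(\overrightarrow{pq} - \overrightarrow{pr}) = D_p\varphi(u).
\end{aligned}$$

Therefore $D_r\varphi = D_p\varphi$. $\square$


Definition 6.5 Let $\mathbb{A}$, $\mathbb{B}$ be affine spaces associated with vector spaces $U$, $W$. A map
$\varphi : \mathbb{A} \to \mathbb{B}$ is said to be affine if the associated map (6.31) is linear and therefore
does not depend on $p$. In this case, the linear map (6.31) is denoted by $D\varphi : U \to W$
and is called the differential of $\varphi$.


Proposition 6.6 Let $\mathbb{A}$, $\mathbb{B}$, $\mathbb{C}$ be affine spaces associated with vector spaces $U$, $V$,
$W$. For two affine maps $\psi : \mathbb{A} \to \mathbb{B}$, $\varphi : \mathbb{B} \to \mathbb{C}$, the composition $\varphi \circ \psi : \mathbb{A} \to \mathbb{C}$ is
affine, and $D(\varphi \circ \psi) = (D\varphi) \circ (D\psi)$.


Proof $D_p(\varphi \circ \psi) : \overrightarrow{pq} \mapsto \overrightarrow{\varphi\psi(p)\,\varphi\psi(q)} = D\varphi\bigl(\overrightarrow{\psi(p)\psi(q)}\bigr) = D\varphi \circ D\psi\,(\overrightarrow{pq})$. $\square$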
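In coordinates, an affine map of $\mathbb{A}(k^n)$ to $\mathbb{A}(k^m)$ can be written as $x \mapsto Mx + c$, where $M$ is the matrix of the differential and $c$ is the image of the origin. The Python sketch below uses this coordinate description as an assumption; the class `AffineMap` and the helper `matmul` are hypothetical names introduced for the illustration only.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

class AffineMap:
    """The map x -> M x + c; M plays the role of the differential."""

    def __init__(self, matrix, shift):
        self.matrix = matrix
        self.shift = tuple(shift)

    def __call__(self, x):
        return tuple(sum(m * xi for m, xi in zip(row, x)) + c
                     for row, c in zip(self.matrix, self.shift))

    def compose(self, other):
        """Return self o other: the differentials multiply, the origin goes to self(other(0))."""
        return AffineMap(matmul(self.matrix, other.matrix), self(other.shift))

phi = AffineMap([[2, 0], [1, 1]], (1, -1))
psi = AffineMap([[0, 1], [1, 0]], (3, 4))
comp = phi.compose(psi)
```

The matrix stored in `comp` is the product of the two matrices, in agreement with $D(\varphi \circ \psi) = (D\varphi) \circ (D\psi)$.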


6.5.6 Affine Groups 


It follows from Proposition 6.6 that the bijective affine maps $\varphi : \mathbb{A}(V) \to \mathbb{A}(V)$ form
a transformation group of the affine space $\mathbb{A}(V)$. This group is called the affine group
of $V$ and is denoted by $\operatorname{Aff}(V)$. It contains the subgroup of shift transformations
$\tau_v : p \mapsto p + v$, which are in bijection with the vectors $v \in V$.


Proposition 6.7 An affine endomorphism $\varphi : \mathbb{A}(V) \to \mathbb{A}(V)$ is bijective if and
only if $D\varphi : V \to V$ is bijective. An affine automorphism $\varphi$ is a shift if and only if
$D\varphi = \operatorname{Id}_V$.


Proof Both statements follow from the equality $\varphi(p + v) = \varphi(p) + D\varphi(v)$. $\square$

6.6 Quotient Spaces


6.6.1 Quotient by a Subspace 


Let $V$ be a vector space over an arbitrary field $k$. Every vector subspace $U \subset V$
provides $V$ with an equivalence relation $v \equiv w \pmod{U}$, meaning that $v - w \in U$,
called congruence modulo $U$.


Exercise 6.22 Check that this is really an equivalence relation on $V$.


We write $[v]_U = v \pmod{U} = v + U = \{w \in V \mid w - v \in U\}$ for the equivalence
class of a vector $v \in V$ modulo $U$. This class coincides with the affine subspace
$\Pi(v, U) \subset \mathbb{A}(V)$ parallel to $U$ and passing through $v \in \mathbb{A}(V)$. The congruence
classes modulo $U$ form a vector space. Addition and multiplication by scalars are
defined by the usual rules

$$[v] + [w] \stackrel{\text{def}}{=} [v + w] \quad\text{and}\quad \lambda[v] \stackrel{\text{def}}{=} [\lambda v].$$


Exercise 6.23 Check that both operations are well defined and satisfy the axioms
of a vector space over $k$.


The vector space of congruence classes modulo $U$ is denoted by $V/U$ and called
the quotient space of $V$ by $U$. The quotient map $\pi : V \twoheadrightarrow V/U$, $v \mapsto [v]$, is linear
and surjective. The vectors of $V/U$ are in bijection with the affine subspaces in $\mathbb{A}(V)$
having $U$ as the direction subspace.


Example 6.19 (Quotient by the Kernel) Associated with a linear map $F : V \to W$
is a canonical isomorphism

$$V/\ker F \xrightarrow{\ \sim\ } \operatorname{im} F, \qquad [v] \mapsto F(v). \tag{6.32}$$

Therefore, every linear map $F : V \to W$ can be decomposed into the quotient
epimorphism $V \twoheadrightarrow V/\ker F$ followed by the monomorphism $V/\ker F \simeq \operatorname{im} F \hookrightarrow W$.


Exercise 6.24 Verify that the map (6.32) is well defined, linear, and bijective.


Proposition 6.8 Let $V$ be a vector space, $U \subset V$ a subspace, and $R \subset U$, $S \subset V \setminus U$
two sets of vectors such that $R$ is a basis in $U$ and $R \cup S$ is a basis in $V$. Then the
congruence classes $[w]_U$, $w \in S$, are distinct and form a basis in $V/U$. For
finite-dimensional $V$, this leads to the equality $\dim(V/U) = \operatorname{codim} U$.


Proof This follows from Proposition 6.1 on p. 135 applied to the quotient map
$V \twoheadrightarrow V/U$. $\square$

Example 6.20 (Linear Span as a Quotient Space) The linear span

$$W = \operatorname{span}(w_1, w_2, \ldots, w_m)$$

of any collection of vectors $w_1, w_2, \ldots, w_m \in V$ can be viewed as the image of the
linear map $F : k^m \to V$ that sends the $i$th standard basic vector $e_i \in k^m$ to $w_i \in W$.
The kernel of this map $U = \ker F \subset k^m$ is nothing but the space of linear relations
among the vectors $w_i$ in $V$, because it consists of all rows $(\lambda_1, \lambda_2, \ldots, \lambda_m) \in k^m$
such that $\lambda_1 w_1 + \lambda_2 w_2 + \cdots + \lambda_m w_m = 0$ in $V$. The isomorphism $W \simeq k^m/U$
from Example 6.19 says that the vectors $w \in W$ can be viewed as congruence
classes of coordinate rows $(x_1, x_2, \ldots, x_m)$, which encode the linear combinations
$x_1 w_1 + x_2 w_2 + \cdots + x_m w_m$, modulo those rows that produce the linear relations
$\sum x_i w_i = 0$ among the vectors $w_i$.


6.6.2 Quotient Groups of Abelian Groups 


The constructions of quotient space and quotient ring²³ are particular cases of the
more general construction of a quotient group. For additive abelian groups, it is
described as follows. Let $A$ be an arbitrary abelian group and $B \subset A$ any subgroup.
Two elements $a_1, a_2 \in A$ are said to be congruent modulo $B$ if $a_1 - a_2 \in B$. This
is an equivalence relation on $A$. We write $a_1 \equiv a_2 \pmod{B}$ for congruent elements
and denote by $[a] = a + B$ the congruence class of an element $a \in A$.


Exercise 6.25 Verify that congruence modulo $B$ is an equivalence and the set
of congruence classes inherits the abelian group structure, defined by
$[a_1] + [a_2] \stackrel{\text{def}}{=} [a_1 + a_2]$.


If besides the abelian group structure, $A$ supports the structure of a $K$-module, then
this $K$-module structure can be transferred to the quotient group $A/B$ as soon as $B$ is a
$K$-submodule of $A$. In this case, multiplication of a congruence class $[a] \in A/B$ by
a constant $\lambda \in K$ is well defined by $\lambda[a] \stackrel{\text{def}}{=} [\lambda a]$.


Exercise 6.26 Check this.


Therefore, for every $K$-module $A$ and $K$-submodule $B$, the quotient module $A/B$
is defined, and the quotient map $A \twoheadrightarrow A/B$, $a \mapsto [a]_B$, is $K$-linear. For $A = K$ and $B$
equal to an ideal $J \subset K$, we get a quotient ring $K/J$. For $K = k$, $A = V$, $B = U \subset V$,
we get the quotient space $V/U$.


23 See Sect. 5.2 on p. 106.

Exercise 6.27 Construct the isomorphism $A/\ker F \xrightarrow{\ \sim\ } \operatorname{im} F$ for a $K$-linear map of
$K$-modules $F : A \to B$.


Problems for Independent Solution to Chap. 6 


Problem 6.1 Prove that the following collections of functions are linearly
independent in the space of all functions $\mathbb{R} \to \mathbb{R}$: (a) $1, \sin x, \sin(2x), \ldots, \sin(nx)$,
(b) $1, \sin x, \sin^2 x, \ldots, \sin^n x$, (c) $e^{\lambda_1 x}, \ldots, e^{\lambda_m x}$, all $\lambda_i \in \mathbb{R}$ distinct,
(d) $x^{\lambda_1}, \ldots, x^{\lambda_m}$, all $\lambda_i \in \mathbb{R}$ distinct.


Problem 6.2 Show that $\mathbb{R}$ is an infinite-dimensional vector space over $\mathbb{Q}$ and prove
that the following collections of real numbers are linearly independent over $\mathbb{Q}$:
(a) $\sqrt{2}, \sqrt{3}, \sqrt{5}$, (b) $p_1^{m_1/n_1}, p_2^{m_2/n_2}, \ldots, p_s^{m_s/n_s}$, where $p_i$ are distinct prime
integers and all $m_i/n_i$ are in $\mathbb{Q} \setminus \mathbb{Z}$.

Problem 6.3 Ascertain whether the given collection of functions is linearly
independent in the space of all functions $\mathbb{F}_p \to \mathbb{F}_p$ over the field $\mathbb{F}_p$:

(a) $1, x, x^2, \ldots, x^{p-1}$, (b) $x, x^2, \ldots, x^p$.


Problem 6.4 Find the dimensions of the following vector spaces over $\mathbb{Q}$:


(a) Polynomials of total degree²⁴ at most $d$ in $\mathbb{Q}[x_1, x_2, \ldots, x_m]$.

(b) Homogeneous degree-$d$ polynomials in $\mathbb{Q}[x_1, x_2, \ldots, x_m]$.

(c) Homogeneous symmetric²⁵ polynomials of degree 10 in $\mathbb{Q}[x_1, x_2, x_3, x_4]$.

(d) Homogeneous symmetric polynomials of degree at most 3 in $\mathbb{Q}[x_1, x_2, x_3, x_4]$.


Problem 6.5 Let $\zeta \in \mathbb{C} \setminus \mathbb{R}$ be a nonreal complex number, say $\zeta = 3 - 2i$. Find the
dimension (over $\mathbb{R}$) of the space of all polynomials $f \in \mathbb{R}[x]$ such that $\deg f \leqslant n$
and $f(\zeta) = 0$.


Problem 6.6 Show that the given collection of polynomials forms a basis for the
vector space $\mathbb{Q}[x]_{\leqslant n}$. Write the matrices of the linear maps $D : f(x) \mapsto f'(x)$,
$\nabla : f(x) \mapsto f(x) - f(x - 1)$, $\Delta : f(x) \mapsto f(x + 1) - f(x)$ in the given basis:


(a) $\beta_k(x) = (x + k)^n$, $0 \leqslant k \leqslant n$,
(b) $\gamma_0(x) = 1$ and $\gamma_k(x) = \binom{x+k}{k} = (x + 1) \cdots (x + k)/k!$, $1 \leqslant k \leqslant n$.


Problem 6.7 For an arbitrary polynomial $q(x) = a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n \in
k[x]$, consider the quotient ring $V = k[x]/(q)$ as a vector space over $k$. Show that
a basis in $V$ is formed by the residues $e_\nu = x^\nu \pmod{q}$ for $0 \leqslant \nu \leqslant \deg q - 1$
and write the matrix of the linear map $F : V \to V$, $[f] \mapsto [xf]$ in this basis. For
$q(x) = (x - \lambda)^n$, where $\lambda \in k$, find a basis in $V$ in which $F$ has the matrix

$$\begin{pmatrix}
\lambda & 1 & & 0\\
 & \lambda & \ddots & \\
 & & \ddots & 1\\
0 & & & \lambda
\end{pmatrix}.$$


24 By definition, the total degree of a monomial $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_m^{\alpha_m}$ is equal to $\alpha_1 + \alpha_2 + \cdots + \alpha_m$. The
total degree of a polynomial $f$ is defined as the maximal total degree of the monomials in $f$.

25 A polynomial $f(x_1, x_2, \ldots, x_m)$ is symmetric if $f(x_{g_1}, x_{g_2}, \ldots, x_{g_m}) = f(x_1, x_2, \ldots, x_m)$ for every
permutation $g = (g_1, g_2, \ldots, g_m) \in S_m$. For example, the polynomial $(x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2$
is symmetric, whereas the polynomial $(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$ is not.

Problem 6.8 (Finite Spaces) In a $d$-dimensional vector space over a finite field
of $q$ elements, for each $k = 1, 2, \ldots, d$ find the total number of (a) ordered
sequences of $k$ linearly independent vectors, (b) $k$-dimensional vector subspaces.
For fixed $k, d \in \mathbb{N}$, write $\binom{d}{k}_q$ for the answer to (b) considered as a function of $q$.
Compute $\lim_{q \to 1} \binom{d}{k}_q$.

Problem 6.9 Does there exist an inclusion of fields Fy — F7? 


Problem 6.10 Let $V$ be a vector space. Suppose the vectors $u_1, u_2, \ldots, u_k \in V$ are
linearly independent and that the vectors $e_1, e_2, \ldots, e_n \in V$ satisfy the following
property: for each $i = 1, 2, \ldots, k$, the vectors $e_1, \ldots, e_{i-1}, u_i, e_{i+1}, \ldots, e_n$
form a basis in $V$. Is it true that for each $i$, the vectors $u_1, \ldots, u_i, e_{i+1}, \ldots, e_n$
form a basis of $V$?


Problem 6.11 Show that every collection of $n + 2$ vectors in an $n$-dimensional vector
space admits a nontrivial linear relation with the sum of the coefficients equal to
zero.


Problem 6.12 Give an example of a finite-dimensional vector space $W$ and a triple
of mutually transversal subspaces $U, V, T \subset W$ such that $\dim U + \dim V +
\dim T = \dim W$ but $W \neq U \oplus V \oplus T$.

Problem 6.13 Let $\dim(U + V) = \dim(U \cap V) + 1$ for some subspaces $U$, $V$ in a
given vector space. Show that $U + V$ equals one of the subspaces $U$, $V$ and that
$U \cap V$ equals the other one.

Problem 6.14 Suppose a collection of $k$-dimensional subspaces $W_1, W_2, \ldots, W_m \subset
V$ satisfies the property $\dim W_i \cap W_j = k - 1$ for all $i \neq j$. Show that there exists
either a $(k - 1)$-dimensional subspace $U \subset V$ contained in each of the $W_i$ or a
$(k + 1)$-dimensional subspace $W \subset V$ containing all the $W_i$.

Problem 6.15 Prove that over an infinite ground field, no finite union of proper 
vector subspaces exhausts the whole space. 

Problem 6.16 Construct the following canonical isomorphisms for an arbitrary 
triple of vector spaces U, V, W: 


(a) $\operatorname{Hom}(U \oplus W, V) \simeq \operatorname{Hom}(U, V) \oplus \operatorname{Hom}(W, V)$,
(b) $\operatorname{Hom}(V, U \oplus W) \simeq \operatorname{Hom}(V, U) \oplus \operatorname{Hom}(V, W)$.


Problem 6.17 Let $\dim U = n$, $\dim W = m$. Assume that subspaces $U_0 \subset U$ and
$W_0 \subset W$ have dimensions $n_0$ and $m_0$ respectively. Show that the set of linear maps
$\{F : U \to W \mid \ker F \supset U_0 \text{ and } \operatorname{im} F \subset W_0\}$ is a vector subspace in $\operatorname{Hom}(U, W)$
and find its dimension.

Problem 6.18 (Projectors) For a nontrivial idempotent linear endomorphism²⁶
$F : V \to V$, put $V_0 = \ker F$, $V_1 = \ker(F - \operatorname{Id}_V)$. Show that $V = V_0 \oplus V_1$
and $F(v_0 + v_1) = v_1$ for all $v_0 \in V_0$, $v_1 \in V_1$.


Problem 6.19 (Involutions) For a nontrivial linear involution²⁷ $F : V \to V$ put

$$V_+ = \{v \in V \mid Fv = v\} = \ker(F - \operatorname{Id}_V),$$
$$V_- = \{v \in V \mid Fv = -v\} = \ker(F + \operatorname{Id}_V).$$

Show that $V_- = \operatorname{im}(F - \operatorname{Id}_V)$, $V_+ = \operatorname{im}(F + \operatorname{Id}_V)$, and $V = V_+ \oplus V_-$.

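Problem 6.20 Verify that for every pair of linear endomorphisms $F, G : V \to V$,
one has $\ker(FG) \supset \ker(G)$ and $\operatorname{im}(FG) \subset \operatorname{im}(F)$. Give some finite-dimensional
examples for which these inclusions are strict.


Problem 6.21 Prove that for every linear endomorphism $F : V \to V$ of a
finite-dimensional vector space $V$:


(a) $\ker(F^k) = \ker(F^{k+1}) \Rightarrow \forall\, n \in \mathbb{N}\ \ker(F^k) = \ker(F^{k+n})$,
(b) $\operatorname{im}(F^k) = \operatorname{im}(F^{k+1}) \Rightarrow \forall\, n \in \mathbb{N}\ \operatorname{im}(F^k) = \operatorname{im}(F^{k+n})$,
(c) $\forall\, n \in \mathbb{N}\ \dim\ker(F^n) = \sum_{k=0}^{n-1} \dim\bigl(\operatorname{im} F^k \cap \ker F\bigr)$, where $F^0 \stackrel{\text{def}}{=} \operatorname{Id}_V$.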


Problem 6.22 Given a collection of points $p_1, p_2, \ldots, p_k$ in an affine space, a
segment joining one of the points with the barycenter of all the other points taken
with equal weights 1 is called a median of the given collection of points. Show
that all $k$ medians meet in one point and that this point divides each median in the
ratio²⁸ $1 : k$.

Problem 6.23 Given a collection of points $p_1, p_2, \ldots, p_m$ in an affine plane $\mathbb{A}^2$, is it
always possible to choose points $q_1, q_2, \ldots, q_m$ such that $p_1, p_2, \ldots, p_{m-1}, p_m$ are
the midpoints of the respective segments²⁹

$$[q_1, q_2],\ [q_2, q_3],\ \ldots,\ [q_{m-1}, q_m],\ [q_m, q_1]\,?$$


Problem 6.24 (Barycentric Coordinates) Assume that points $p_0, p_1, \ldots, p_n \in
\mathbb{A}^n = \mathbb{A}(k^n)$ do not lie in a common affine hyperplane. Show that the mapping

$$(x_0, x_1, \ldots, x_n) \mapsto x_0p_0 + x_1p_1 + \cdots + x_np_n$$

establishes a bijection between the collections of weights $(x_0, x_1, \ldots, x_n) \in
k^{n+1}$ such that $\sum x_i = 1$ and the points of $\mathbb{A}^n$. The weights $(\alpha_0, \alpha_1, \ldots, \alpha_n)$
corresponding to a given point $a \in \mathbb{A}^n$ are called the barycentric coordinates of $a$
with respect to the reference points $p_0, p_1, \ldots, p_n$.


26 That is, such that $F^2 = F$ but $F \neq 0$ and $F \neq \operatorname{Id}_V$.

27 That is, such that $F^2 = \operatorname{Id}_V$ but $F \neq \operatorname{Id}_V$.

28 We say that a point $c$ divides the segment $[a, b]$ in the ratio $\alpha : \beta$ if $\beta \cdot \overrightarrow{ca} + \alpha \cdot \overrightarrow{cb} = 0$.

29 A point is the midpoint of a segment if that point divides the segment in the ratio $1 : 1$.


Problem 6.25 For a triple of noncollinear reference points $a, b, c \in \mathbb{A}(\mathbb{R}^2)$, find the
barycentric coordinates of the points $a_1$, $b_1$, $c_1$ such that $b_1$ is the midpoint of
$[a, c_1]$, $c_1$ is the midpoint of $[b, a_1]$, and $a_1$ is the midpoint of $[c, b_1]$.

Problem 6.26 Given a triple of noncollinear reference points $a, b, c \in \mathbb{A}(\mathbb{R}^2)$, draw
the locus of all points $p$ whose barycentric coordinates $(\alpha, \beta, \gamma)$ with respect to
$a$, $b$, $c$ satisfy the following conditions: (a) $\alpha, \beta, \gamma > 0$, (b) $\alpha, \beta > 0$, $\gamma < 0$,
(c) $\alpha = \beta$, (d) $\alpha, \beta > 1/3$, $\gamma > 0$, (e) $\alpha \geqslant \beta$, (f) $\alpha \geqslant \beta \geqslant \gamma$.

Problem 6.27 Under the conditions of the previous problem, write down some
explicit systems of relations among the barycentric coordinates $(\alpha, \beta, \gamma)$ defining
(a) six triangles cut out of $\triangle abc$ by its medians, (b) two triangles obtained from
$\triangle abc$ by homotheties³⁰ with ratios 3 and 1/3 with the center at the intersection
point of the medians of $\triangle abc$.


Problem 6.28 Let a linear map $F : V \to W$ send a subspace $U \subset V$ into a subspace
$T \subset W$. Show that the mapping $v \pmod{U} \mapsto F(v) \pmod{T}$ defines a linear
map $\widetilde{F} : V/U \to W/T$.

Problem 6.29 For a tower of embedded vector spaces $U \subset V \subset W$, construct a
linear embedding $V/U \hookrightarrow W/U$ and an isomorphism

$$(W/U)/(V/U) \xrightarrow{\ \sim\ } W/V.$$

Problem 6.30 For a pair of vector subspaces $U, W \subset V$, construct an isomorphism

$$(U + W)/U \xrightarrow{\ \sim\ } W/(U \cap W).$$


30 A homothety with center $c \in \mathbb{A}^n$ and ratio $\lambda \in k$ is a map $\gamma_{c,\lambda} : \mathbb{A}^n \to \mathbb{A}^n$, $p \mapsto c + \lambda\,\overrightarrow{cp}$.


Chapter 7 
Duality 


7.1 Dual Spaces 


7.1.1 Covectors 


Let $V$ be a vector space over a field $k$. A linear map $\xi : V \to k$ is called a covector
or linear form¹ on $V$. The covectors on $V$ form a vector space, denoted by

$$V^* = \operatorname{Hom}_k(V, k)$$

and called the dual space to $V$. We have seen in Sect. 6.3.2 on p. 136 that every
linear map is uniquely determined by its values on an arbitrarily chosen basis. In
particular, every covector $\xi \in V^*$ is uniquely determined by the numbers $\xi(e) \in k$ as $e$
runs through some basis of $V$. The next lemma is a particular case of Proposition 6.2
on p. 137. However, we rewrite it here once more in notation that does not assume
the finite-dimensionality of $V$.


Lemma 7.1 Let $E \subset V$ be a basis² of a vector space $V$. For a linear form $\varphi :
V \to k$, write $\varphi|_E : E \to k$ for the restriction of $\varphi$ to $E \subset V$. Then the assignment
$\varphi \mapsto \varphi|_E$ gives a linear isomorphism between $V^*$ and the space $k^E$ of all functions
$E \to k$.


Exercise 7.1 Verify that the map $\varphi \mapsto \varphi|_E$ is linear.


Proof (of Lemma 7.1) We construct the inverse map $k^E \to V^*$, $f \mapsto \tilde f$. For a
function $f : E \to k$ and vector $v = \sum_{e \in E} x_e e$, put $\tilde f(v) = \sum_{e \in E} x_e f(e)$.


Exercise 7.2 Verify that both maps $\tilde f : V \to k$ and $f \mapsto \tilde f$ are linear.


Since every linear form $\varphi : V \to k$ sends every vector $\sum_{e \in E} x_e e \in V$ to
$\varphi\bigl(\sum_{e \in E} x_e e\bigr) = \sum_{e \in E} x_e \varphi(e)$, the equalities $\widetilde{\varphi|_E} = \varphi$ and $\tilde f|_E = f$ hold for all
$\varphi \in V^*$ and all $f \in k^E$. $\square$


1 Also linear functional.
2 Not necessarily finite.


Example 7.1 (Evaluation Functionals) Let $V = k^X$ be the space of all functions
$X \to k$ on an arbitrary set³ $X$. Associated with each point $p \in X$ is the evaluation
functional $\operatorname{ev}_p : V \to k$ sending a function $f : X \to k$ to its value $f(p) \in k$.


Exercise 7.3 Check that $\operatorname{ev}_p : V \to k$ is a linear map.


If the set $X$ is finite, then the set of covectors $\{\operatorname{ev}_p\}_{p \in X}$ is a basis in $V^*$, and the
coordinates of every covector $\xi \in V^*$ in this basis are equal to the values of $\xi$ on the
delta functions⁴ $\delta_p : X \to k$. In other words, every linear form $\xi : V \to k$ admits a
unique linear expansion through the forms $\operatorname{ev}_p$, and this expansion is

$$\xi = \sum_{p \in X} \xi(\delta_p) \cdot \operatorname{ev}_p. \tag{7.1}$$

Indeed, we know from Example 6.8 on p. 129 that the delta functions form a basis
in $V = k^X$. Therefore, (7.1) holds as soon as both sides take equal values on each delta
function $\delta_q$. The latter is certainly true, because

$$\operatorname{ev}_p(\delta_q) = \delta_q(p) = \begin{cases} 1 & \text{if } p = q,\\ 0 & \text{otherwise,}\end{cases}$$

by the definition of the delta function. For the same reason, every linear expansion
$\xi = \sum_{p \in X} \lambda_p \operatorname{ev}_p$, $\lambda_p \in k$, must have $\lambda_p = \xi(\delta_p)$: just evaluate both sides at $\delta_p$.


Example 7.2 (Coordinate Forms) Let $E \subset V$ be any basis. For each basis vector
$e \in E$, write $e^* : V \to k$ for the covector taking each vector $v = \sum_{e \in E} x_e e \in V$
to its coordinate $x_e = x_e(v)$ along $e$. In other words, the set of covectors $\{e^*\}_{e \in E}$ is
uniquely defined by the equality $v = \sum_{e \in E} e^*(v) \cdot e$, holding for all $v \in V$.


Exercise 7.4 Check that each map $e^* : V \to k$ is linear.


The covectors $e^*$ are called coordinate forms of the basis $E$. In terms of Lemma 7.1,
they correspond to delta functions $\delta_e : E \to k$; i.e., the values of $e^*$ on the basic
vectors $w \in E$ are

$$e^*(w) = \begin{cases} 1 & \text{if } w = e,\\ 0 & \text{if } w \neq e.\end{cases} \tag{7.2}$$


3 See Example 6.8 on p. 129.
4 See formula (6.15) on p. 129.




Proposition 7.1 For every basis $E \subset V$, the set of coordinate forms $\{e^*\}_{e \in E}$ is
linearly independent in $V^*$. If $E$ is finite, they form a basis of $V^*$. In particular,
$\dim V = \dim V^*$ in this case.


Proof Evaluation of both sides of the linear relation⁵ $\sum_{e \in E} \lambda_e e^* = 0$ at a basic
vector $w \in E$ leads to the equality $\lambda_w = 0$. Therefore, the covectors $e^*$ are linearly
independent. If $E$ is finite, then every linear form $\varphi : V \to k$ can be linearly
expanded through the forms $e^*$ as

$$\varphi = \sum_{e \in E} \varphi(e) \cdot e^*.$$

Indeed, by (7.2), the values of both sides applied to each basic vector $w \in E$ are
equal to $\varphi(w)$. $\square$


Caution 7.1 If a basis $E$ of $V$ is infinite, then the coordinate forms $e^*$ do not span
$V^*$. The restriction isomorphism $V^* \xrightarrow{\ \sim\ } k^E$ from Lemma 7.1 sends the coordinate
form $e^*$ to the delta function $\delta_e : E \to k$, whose support is just one point. The linear
span of the delta functions consists of all functions $E \to k$ with finite support. If $E$
is infinite, the space of all functions is strictly larger. It may have a larger cardinality
even as a set. For example, for $k = \mathbb{Q}$, $E = \mathbb{N}$, the set of finitely supported functions
$\mathbb{N} \to \mathbb{Q}$ is countable, whereas the set of all functions is uncountable.


Example 7.3 (Power Series as Linear Forms on Polynomials) The space of
polynomials $k[x]$ has the standard countable basis formed by monomials $x^k$. By Lemma 7.1,
the dual space $k[x]^*$ is isomorphic to the space of sequences $f : \mathbb{Z}_{\geqslant 0} \to k$, $f_k =
\varphi(x^k)$. The latter are in bijection with their generating power series $f(t) = \sum_{k \geqslant 0} f_k t^k$.
Therefore, $k[x]^* \simeq k[[t]]$, and a linear form $\tilde f : k[x] \to k$ corresponding to the series

$$f = f_0 + f_1 t + f_2 t^2 + \cdots \in k[[t]]$$

maps $\tilde f : a_0 + a_1 x + \cdots + a_m x^m \mapsto f_0 a_0 + f_1 a_1 + \cdots + f_m a_m$. For example, for every
$\alpha \in k$, the evaluation form $\operatorname{ev}_\alpha : k[x] \to k$, $A(x) \mapsto A(\alpha)$, comes from the power
series

$$\sum_{k \geqslant 0} \alpha^k t^k = (1 - \alpha t)^{-1}.$$


Exercise 7.5 Show that the set of geometric progressions $\{(1 - \alpha t)^{-1}\}_{\alpha \in k^*}$ is
linearly independent in $k[[t]]$.


In particular, every basis of $\mathbb{R}[[t]]$ over $\mathbb{R}$ should have at least the cardinality of the
continuum, whereas $\mathbb{R}[t]$ has a countable basis.


5 Recall that all but a finite number of the $\lambda_e$ equal zero.

7.1.2 Canonical Inclusion $V \hookrightarrow V^{**}$


Associated with every vector $v \in V$ is the evaluation form

$$\operatorname{ev}_v : V^* \to k, \qquad \varphi \mapsto \varphi(v),$$

which is a linear form on the dual space $V^*$, that is, an element of $V^{**}$. It takes a
covector $\varphi \in V^*$ to its value $\varphi(v) \in k$ on $v$. Since the latter depends linearly on
both $v$ and $\varphi$, we get the canonical⁶ linear map

$$\operatorname{ev} : V \to V^{**}, \qquad v \mapsto \operatorname{ev}_v. \tag{7.3}$$


Theorem 7.1 The canonical map (7.3) is injective. If $\dim V < \infty$, then it is an
isomorphism. The latter means that every linear form $\xi : V^* \to k$ coincides with
evaluation $\operatorname{ev}_v : V^* \to k$ for some vector $v \in V$ uniquely determined by $\xi$.


Proof Injectivity means that for every nonzero vector $v \in V$, there exists some
covector $\varphi \in V^*$ such that $\operatorname{ev}_v(\varphi) = \varphi(v) \neq 0$. By Theorem 6.1 on p. 132, the
vector $v$ can be included in some basis $E$ of $V$. Then the coordinate form $\varphi = v^*$ has
the required property. If $V$ is of finite dimension, then $\dim V = \dim V^* = \dim V^{**}$
by Proposition 7.1, and therefore, the injective map (7.3) has to be surjective. $\square$


7.1.3 Dual Bases 


The previous theorem says that dual finite-dimensional vector spaces $V$ and $V^*$
are in a completely symmetric relationship with each other.⁷ In particular, each
basis $\varphi_1, \varphi_2, \ldots, \varphi_n$ in $V^*$ consists of coordinate forms $\varphi_i = e_i^*$ for some basis
$e_1, e_2, \ldots, e_n$ in $V$ uniquely determined by the basis $\varphi_1, \varphi_2, \ldots, \varphi_n$. Namely, the $e_i$
are the vectors whose evaluation forms $\operatorname{ev}_{e_i}$ coincide with the coordinate forms $\varphi_i^*$
of the basis $\varphi_1, \varphi_2, \ldots, \varphi_n$ in $V^*$. Bases $e_1, e_2, \ldots, e_n \in V$ and $e_1^*, e_2^*, \ldots, e_n^* \in V^*$
are said to be dual if each of them consists of the coordinate forms for the other,
that is,

$$e_i^*(e_j) = \operatorname{ev}_{e_j}(e_i^*) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \neq j.\end{cases}$$


6 Meaning that it does not depend on any extra data such as the choice of basis.

7 For infinite-dimensional spaces, this is not true, as we saw in Example 7.3.

Exercise 7.6 Assume that $\dim V = n$ and let $v_1, v_2, \ldots, v_n \in V$ and
$\varphi_1, \varphi_2, \ldots, \varphi_n \in V^*$ satisfy the conditions $\varphi_i(v_i) = 1$ and $\varphi_i(v_j) = 0$ for $i \neq j$.
Show that


(a) Both collections of vectors are bases.

(b) The $i$th coordinate of every vector $v \in V$ in the basis $v_1, v_2, \ldots, v_n$ is equal to
$\varphi_i(v)$.

(c) The $i$th coordinate of every covector $\xi \in V^*$ in the basis $\varphi_1, \varphi_2, \ldots, \varphi_n$ is equal
to $\xi(v_i)$.

Example 7.4 (Lagrange’s Formula) Every $n + 1$ distinct constants $a_0, a_1, \ldots, a_n \in
k$ produce $n + 1$ evaluation forms on the space $k[x]_{\leqslant n}$ of polynomials of degree at
most $n$:

$$\varphi_i = \operatorname{ev}_{a_i} : k[x]_{\leqslant n} \to k, \qquad f \mapsto f(a_i).$$

The polynomial $f_i(x) = \prod_{\nu \neq i}(x - a_\nu)$ vanishes at all points $a_\nu$ except for $a_i$, where
$f_i(a_i) \neq 0$. Therefore, the polynomials $v_i(x) = f_i(x)/f_i(a_i)$ and evaluation forms $\varphi_i$
satisfy the conditions of Exercise 7.6. Thus, $v_i$ and $\varphi_i$ form dual bases in $k[x]_{\leqslant n}$ and
$k[x]_{\leqslant n}^*$. This means that each polynomial $g \in k[x]_{\leqslant n}$ admits a unique expansion
as a linear combination of polynomials $v_i(x)$, and the coefficients of this linear
combination equal $g(a_i)$, i.e.,

$$g(x) = \sum_{i=0}^{n} g(a_i) \cdot v_i(x) = \sum_{i=0}^{n} g(a_i) \prod_{\nu \neq i} \frac{x - a_\nu}{a_i - a_\nu}. \tag{7.4}$$

This can be equivalently reformulated as follows. For every collection of constants

$$g_0, g_1, \ldots, g_n \in k,$$

there exists a unique polynomial $g \in k[x]_{\leqslant n}$ such that $g(a_i) = g_i$ for all $i$,
and this polynomial is given by the formula (7.4), which is known as Lagrange’s
interpolation formula.
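Formula (7.4) can be evaluated directly. The following Python sketch works over $\mathbb{Q}$ with exact fractions; the function names `basis_value` and `interpolate` are assumptions made for this illustration and do not come from the text.

```python
from fractions import Fraction as Fr

def basis_value(nodes, i, x):
    """Value at x of v_i(x) = prod over nu != i of (x - a_nu) / (a_i - a_nu)."""
    result = Fr(1)
    for nu, a in enumerate(nodes):
        if nu != i:
            result *= Fr(x - a) / Fr(nodes[i] - a)
    return result

def interpolate(nodes, values, x):
    """Value at x of the polynomial g of degree <= n with g(a_i) = g_i, by formula (7.4)."""
    return sum(g * basis_value(nodes, i, x) for i, g in enumerate(values))

nodes = [0, 1, 2]        # distinct constants a_0, a_1, a_2
values = [1, 2, 5]       # prescribed values; they agree with g(x) = x^2 + 1
at_three = interpolate(nodes, values, 3)
```

The duality conditions of Exercise 7.6 correspond to `basis_value(nodes, i, nodes[j])` being 1 for `i == j` and 0 otherwise.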


Example 7.5 (Taylor’s Formula) Assume that $\operatorname{char} k = 0$, fix some point $a \in k$,
and for $0 \leqslant k \leqslant n$ write

$$\varphi_k = \operatorname{ev}_a \circ \frac{d^k}{dx^k} : k[x]_{\leqslant n} \to k, \qquad f \mapsto f^{(k)}(a),$$

for a linear form that sends a polynomial $f$ to the value of its $k$th derivative $f^{(k)}$ at
$a$, where we put $f^{(0)} \stackrel{\text{def}}{=} f$ to make the notation uniform. The covectors $\varphi_0, \varphi_1, \ldots, \varphi_n$
and polynomials $v_k = (x - a)^k/k!$ satisfy the conditions of Exercise 7.6. Hence,
they form dual bases in $k[x]_{\leqslant n}^*$ and $k[x]_{\leqslant n}$. Therefore, every polynomial $g \in k[x]_{\leqslant n}$
admits a unique expansion as a linear combination of polynomials $v_k(x)$, and the
coefficients of this linear combination equal $g^{(k)}(a)$:

$$g(x) = g(a) + g'(a) \cdot (x - a) + g''(a) \cdot \frac{(x - a)^2}{2} + \cdots + g^{(n)}(a) \cdot \frac{(x - a)^n}{n!}. \tag{7.5}$$

The expansion (7.5) is called the Taylor expansion of $g$ at $a$. Note that this is an
exact equality in $k[x]$.
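The exactness of (7.5) is easy to test on a concrete polynomial. The Python sketch below assumes polynomials over $\mathbb{Q}$ stored as coefficient lists `[c0, c1, ...]`; this representation and the helper names are choices made for the illustration.

```python
from fractions import Fraction as Fr
from math import factorial

def evaluate(coeffs, x):
    """Value of c0 + c1*x + c2*x^2 + ... at x (Horner's scheme)."""
    result = Fr(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def derivative(coeffs):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def taylor_coefficients(coeffs, a):
    """The numbers g^(k)(a)/k! for k = 0, 1, ..., len(coeffs) - 1."""
    out, current = [], list(coeffs)
    for k in range(len(coeffs)):
        out.append(evaluate(current, a) / factorial(k))
        current = derivative(current)
    return out

def taylor_value(coeffs, a, x):
    """Right-hand side of (7.5) evaluated at x."""
    return sum(c * (x - a) ** k
               for k, c in enumerate(taylor_coefficients(coeffs, a)))

g = [1, 0, -2, 1]                      # g(x) = 1 - 2x^2 + x^3
expansion = taylor_coefficients(g, 2)  # coefficients of 1, (x-2), (x-2)^2, (x-2)^3
```

For this $g$, the expansion at $a = 2$ reads $g(x) = 1 + 4(x - 2) + 4(x - 2)^2 + (x - 2)^3$.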


7.1.4 Pairings 


The symmetry between $V$ and $V^*$ motivates the next helpful notion. Given two
vector spaces $U$, $W$, a pairing between them is a map $\langle *, * \rangle : U \times W \to k$ sending
each pair of vectors $u$, $w$ to a number $\langle u, w \rangle \in k$ that is bilinear⁸ in $u$, $w$, meaning
that for all $u_1, u_2 \in U$, all $w_1, w_2 \in W$, and all $\lambda_1, \lambda_2, \mu_1, \mu_2 \in k$, the following
equality holds:

$$\begin{aligned}
\langle \lambda_1 u_1 + \lambda_2 u_2,\ &\mu_1 w_1 + \mu_2 w_2 \rangle\\
&= \lambda_1\mu_1 \langle u_1, w_1 \rangle + \lambda_1\mu_2 \langle u_1, w_2 \rangle + \lambda_2\mu_1 \langle u_2, w_1 \rangle + \lambda_2\mu_2 \langle u_2, w_2 \rangle.
\end{aligned}$$

A pairing is called perfect if it possesses the properties listed in Proposition 7.2
below.


Proposition 7.2 (Perfect Pairing) The following properties of a pairing $\langle *, * \rangle :
U \times W \to k$ between finite-dimensional vector spaces $U$, $W$ are equivalent:


(1) For every nonzero $u \in U$, there is some $w \in W$, and for every nonzero $w \in W$,
there is some $u \in U$, such that $\langle u, w \rangle \neq 0$.

(2) The map $U \to W^*$ sending a vector $u \in U$ to the covector $w \mapsto \langle u, w \rangle$ on $W$
is an isomorphism.

(3) The map $W \to U^*$ sending a vector $w \in W$ to the covector $u \mapsto \langle u, w \rangle$ on $U$
is an isomorphism.


Proof Since $\langle u, w \rangle$ is bilinear, both maps from (2), (3) are well defined and linear.
Condition (1) says that both are injective. Therefore, if (1) holds, then $\dim U \leqslant
\dim W^*$ and $\dim W \leqslant \dim U^*$. Since $\dim U = \dim U^*$ and $\dim W = \dim W^*$,
both previous inequalities are equalities. This forces both inclusions (2), (3) to be
bijective. Thus, (1) implies (2) and (3). Let us show that (2) and (3) are equivalent.
By symmetry, it is enough to prove that (2) implies (3). Since (2) forces $\dim U =
\dim W$, we have only to check that the map (3) is injective. Let the vector $w_0 \in W$
be in the kernel of map (3). This means that $\langle u, w_0 \rangle = 0$ for all $u \in U$. Since by
(2), every linear form $\xi : W \to k$ can be written as $\xi(w) = \langle u_\xi, w \rangle$ for appropriate $u_\xi \in U$, we
conclude that every linear form $\xi \in W^*$ vanishes at $w_0$. This forces $w_0 = 0$.


Exercise 7.7 Verify that for every nonzero vector $w \in W$, there is some covector
$\xi \in W^*$ such that $\xi(w) \neq 0$.


Therefore, we get the equivalence (2) $\Longleftrightarrow$ (3). The implication (2) & (3) $\Rightarrow$ (1) is
obvious. $\square$


8 See the comments before formula (6.11) on p. 126.


Example 7.6 (Contraction Between Vectors and Covectors) Evaluation of a
covector $\varphi \in V^*$ at the vector $v \in V$ gives a perfect pairing

$$\langle v, \varphi \rangle \stackrel{\text{def}}{=} \varphi(v) = \operatorname{ev}_v(\varphi) \tag{7.6}$$

between dual finite-dimensional vector spaces $V$ and $V^*$. It is called a contraction
of a vector with a covector. The notation $\langle v, \varphi \rangle$ emphasizes the symmetry between
$V$ and $V^*$, and we will often use it in what follows instead of $\varphi(v)$ or $\operatorname{ev}_v(\varphi)$.


Example 7.7 ($2 \times 2$ Determinant) A perfect pairing of the coordinate plane $k^2$ with
itself is defined by the $2 \times 2$ determinant⁹

$$\det : k^2 \times k^2 \to k, \qquad (v_1, v_2) \mapsto \det(v_1, v_2).$$

In particular, each linear form $\varphi : k^2 \to k$ can be written as $\varphi(a) = \det(b_\varphi, a)$,
where $b_\varphi \in k^2$ is uniquely determined by $\varphi$.
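For a linear form written in coordinates as $\varphi(a) = c_1 a_1 + c_2 a_2$, the representing vector can be read off by expanding $\det(b, a) = b_1 a_2 - b_2 a_1$, which gives $b_\varphi = (c_2, -c_1)$. The Python sketch below checks this; the coordinate formula for $\det$ and the helper names are assumptions of the illustration.

```python
def det(v1, v2):
    """The 2 x 2 determinant with columns v1, v2."""
    return v1[0] * v2[1] - v1[1] * v2[0]

def representing_vector(c1, c2):
    """Vector b such that det(b, a) = c1*a[0] + c2*a[1] for every a."""
    return (c2, -c1)

c1, c2 = 3, 5                      # the form phi(a) = 3*a1 + 5*a2
b = representing_vector(c1, c2)
samples = [(1, 0), (0, 1), (2, -7), (-4, 9)]
```

Evaluating both sides on the standard basis vectors already determines $b_\varphi$, which reflects the uniqueness claimed in the example.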


7.2 Annihilators 


From this point on, we assume by default that all vector spaces we deal with are 
finite-dimensional. Every set of covectors M C V* can be viewed as a system of 
homogeneous linear equations {§(x) = O}sey in the unknown vector x € V. The 
solution space of this system of equations is denoted by 


Ann(M) © {vu € V| VE € M &(v) = 0} 


and called the annihilator of the set M C V*. Note that Ann(M) C V is always 
a vector subspace, because it is the intersection of vector subspaces ker €& for all 
linear forms € : V — k from M. Dually, for every set of vectors N C V, 
we put Ann(N) = {g € V*| Vu € N g(v) = 0} and call this subspace the 
annihilator of N. Geometrically, Ann(N) consists of all hyperplanes in V containing 
N. Equivalently, one can think of Ann(N) as the solution space of the system of 


⁹See Example 6.4 on p. 125.
homogeneous linear equations $\{\mathrm{ev}_v(\psi) = 0\}_{v \in N}$ in the unknown covector $\psi \in V^*$,
that is, the intersection of the hyperplanes $\operatorname{Ann}(v) \subset V^*$ taken for all nonzero $v \in N$.
Of course, $\operatorname{Ann}(N)$ is a vector subspace in $V^*$ for every set $N \subset V$.


Exercise 7.8 Check that $\operatorname{Ann} N = \operatorname{Ann} \operatorname{span} N$ for every subset $N \subset V$.


Proposition 7.3 $\dim U + \dim \operatorname{Ann} U = \dim V$ for every subspace $U \subset V$.


Proof Let the vectors $u_1, u_2, \ldots, u_k$ be a basis of $U$ and suppose that the vectors
$w_1, w_2, \ldots, w_m$ complete them to a basis in $V$. Therefore, $\dim V = k + m$. Write

$u_1^*, u_2^*, \ldots, u_k^*, \; w_1^*, w_2^*, \ldots, w_m^*$

for the dual basis of $V^*$. Then $w_1^*, w_2^*, \ldots, w_m^* \in \operatorname{Ann} U$, because for every $v = \sum x_i u_i \in U$,


$w_\nu^*(v) = w_\nu^*(x_1 u_1 + x_2 u_2 + \cdots + x_k u_k) = \sum x_i \, w_\nu^*(u_i) = 0.$


Since every covector $\varphi = \sum y_i u_i^* + \sum z_j w_j^* \in \operatorname{Ann}(U)$ has $y_i = \varphi(u_i) = 0$, the
basis covectors $w_1^*, w_2^*, \ldots, w_m^*$ span $\operatorname{Ann}(U)$ and therefore form a basis in $\operatorname{Ann}(U)$.
Hence, $\dim \operatorname{Ann}(U) = m = \dim V - \dim U$. $\square$
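
As a numerical illustration of Proposition 7.3, the following minimal sketch computes a basis of $\operatorname{Ann} U$ for a subspace $U \subset \mathbb{Q}^4$ given by spanning vectors. Covectors are stored as rows of coefficients in the dual of the standard basis, so $\operatorname{Ann} U$ is the null space of the matrix whose rows span $U$. The helper names `rref` and `annihilator_basis` and the sample vectors are assumptions made for this example only.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over Q; returns (matrix, list of pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def annihilator_basis(spanning_vectors, dim_v):
    """Basis of Ann U: covectors xi with sum_i xi[i] * u[i] == 0 for all u in U."""
    m, pivots = rref(spanning_vectors)
    basis = []
    for f in (c for c in range(dim_v) if c not in pivots):
        xi = [Fraction(0)] * dim_v
        xi[f] = Fraction(1)                  # one free coordinate set to 1
        for row, p in zip(m, pivots):        # pivot coordinates are then forced
            xi[p] = -row[f]
        basis.append(xi)
    return basis


# Sample data (hypothetical): the third vector is the sum of the first two.
U_SPAN = [[1, 0, 1, 0], [0, 1, 1, 1], [1, 1, 2, 1]]
dim_u = len(rref(U_SPAN)[1])
ann = annihilator_basis(U_SPAN, 4)
print(dim_u, len(ann))  # the two dimensions add up to dim V = 4
```

The pivot columns count $\dim U$, the free columns count $\dim \operatorname{Ann} U$, and together they exhaust all $\dim V$ columns, which mirrors the basis completion used in the proof.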


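Corollary 7.1 $\operatorname{Ann} \operatorname{Ann}(U) = U$ for every subspace $U \subset V$.


Proof By definition, $U \subset \operatorname{Ann} \operatorname{Ann}(U)$. At the same time, Proposition 7.3 implies
that $\dim \operatorname{Ann} \operatorname{Ann} U = \dim V^* - \dim \operatorname{Ann} U = \dim V^* - \dim V + \dim U = \dim U$.
$\square$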


Corollary 7.2 $\dim U + \dim \operatorname{Ann} U = \dim V$ and $\operatorname{Ann} \operatorname{Ann}(U) = U$ for every
subspace $U \subset V^*$ as well.


Proof Apply Proposition 7.3 and Corollary 7.1 to the dual space $V^*$ instead of $V$
and use the canonical identification $V^{**} \simeq V$. $\square$


Remark 7.1 In the language of linear equations, the relation $\dim U + \dim \operatorname{Ann} U =
\dim V$ means that every vector subspace of codimension $m$ in $V$ can be determined
by means of a system of $m$ linearly independent linear homogeneous equations. The
dual relation says that conversely, the solution space of every system of $m$ linearly
independent linear homogeneous equations has codimension $m$ in $V$. The equality
$\operatorname{Ann} \operatorname{Ann}(U) = U$ means that every linear form $\varphi$ that vanishes on the solution
space of linear equations $\{\xi(x) = 0\}_{\xi \in M}$ lies in the linear span of $M$, i.e., can be
linearly expressed through the left-hand sides of the equations.


Exercise 7.9 Show that $\operatorname{Ann} \operatorname{Ann} N = \operatorname{span} N$ for every subset $N \subset V$.
Theorem 7.2 The correspondence $U \leftrightarrow \operatorname{Ann}(U)$ is a bijection between the
subspaces of complementary¹⁰ dimensions in $V$ and in $V^*$. This bijection reverses
the inclusions


$U \subset W \iff \operatorname{Ann} U \supset \operatorname{Ann} W,$


and takes the sums of subspaces to the intersections and conversely:


$\operatorname{Ann} \sum_i U_i = \bigcap_i \operatorname{Ann} U_i, \qquad \operatorname{Ann} \bigcap_i U_i = \sum_i \operatorname{Ann} U_i.$


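Proof Write $S(V)$ for the set of all vector subspaces in $V$. The equality
$\operatorname{Ann} \operatorname{Ann}(U) = U$ means that the maps sending each subspace to its annihilator


$S(V) \to S(V^*), \ U \mapsto \operatorname{Ann} U, \qquad S(V^*) \to S(V), \ W \mapsto \operatorname{Ann} W,$


are inverse to each other. Hence, both are bijective. The implication $U \subset W \Rightarrow
\operatorname{Ann} U \supset \operatorname{Ann} W$ is obvious from the definition of annihilator. If we apply this
implication to $\operatorname{Ann} W$, $\operatorname{Ann} U$ in the roles of $U$, $W$ and use the equalities $\operatorname{Ann} \operatorname{Ann} W =
W$, $\operatorname{Ann} \operatorname{Ann} U = U$, then we get the opposite implication $\operatorname{Ann} U \supset \operatorname{Ann} W \Rightarrow
U \subset W$. The equality


$\bigcap_\nu \operatorname{Ann} U_\nu = \operatorname{Ann} \sum_\nu U_\nu$ (7.7)


also follows from the definitions: every linear form annihilating each $U_\nu$ annihilates
the linear span of all $U_\nu$, and conversely. If we replace $U_\nu$ by $\operatorname{Ann} U_\nu$ in (7.7), we get
the equality


$\bigcap_\nu U_\nu = \operatorname{Ann} \sum_\nu \operatorname{Ann} U_\nu.$


For the annihilators of both sides, we get $\operatorname{Ann} \bigcap_\nu U_\nu = \sum_\nu \operatorname{Ann} U_\nu$. $\square$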
Proposition 7.4 Let $V$ be an arbitrary¹¹ vector space. For every subspace $U \subset V$,
the restriction map


$r_U : V^* \to U^*, \quad \varphi \mapsto \varphi|_U,$ (7.8)


which takes a linear form $\varphi : V \to \Bbbk$ to its restriction onto $U \subset V$, has $\ker r_U =
\operatorname{Ann} U$ and induces a well-defined isomorphism $V^* / \operatorname{Ann} U \xrightarrow{\ \sim\ } U^*$, $[\varphi] \mapsto \varphi|_U$.
Every linear form $\psi \in \operatorname{Ann} U$ induces a linear form $\overline{\psi} : V/U \to \Bbbk$ well defined by
the assignment


$v \ (\mathrm{mod}\ U) \mapsto \psi(v).$ (7.9)


The resulting map


$\operatorname{Ann} U \to (V/U)^*, \quad \psi \mapsto \overline{\psi},$ (7.10)


is an isomorphism of vector spaces too.


¹⁰That is, between $k$-dimensional and $(\dim V - k)$-dimensional subspaces for each $k =
0, 1, \ldots, \dim V$.


¹¹Possibly infinite-dimensional.


Proof Fix two disjoint sets of vectors $E \subset U$ and $F \subset V$ such that $E$ is a basis of
$U$ and $E \cup F$ is a basis of $V$. For every linear form $\xi : U \to \Bbbk$, write $\widetilde{\xi} : V \to \Bbbk$
for the linear form that sends each basic vector $e \in E$ to $\xi(e)$ and annihilates all
basic vectors $f \in F$. Then $\widetilde{\xi}|_U = \xi$. Thus, the linear map (7.8) is surjective. We
know from Example 6.19 on p. 149 that $r_U$ induces a well-defined isomorphism
$V^* / \operatorname{Ann} U = V^* / \ker r_U \xrightarrow{\ \sim\ } \operatorname{im} r_U = U^*$. This proves the first statement. Further,
for all $u \in U$, $v \in V$, $\psi \in \operatorname{Ann} U$ we have $\psi(v + u) = \psi(v) + \psi(u) = \psi(v)$. Thus,
the covector (7.9) is well defined.


Exercise 7.10 Check that the map (7.10) is linear.


By Proposition 6.8 on p. 149, the congruence classes $[f] = f \ (\mathrm{mod}\ U)$ for $f \in F$
form a basis in $V/U$. By Lemma 7.1 on p. 155, the covectors $\xi \in (V/U)^*$ are in
bijection with functions $F \to \Bbbk$, that is, with collections of constants $\xi_f = \xi([f]) \in
\Bbbk$. The covectors $\psi \in \operatorname{Ann} U$ also are in bijection with the collections of constants
$\psi_f = \psi(f) \in \Bbbk$, because $\psi_e = \psi(e) = 0$ for all $e \in E$. Hence, for every covector
$\xi \in (V/U)^*$ there exists a unique covector $\psi \in \operatorname{Ann} U$ such that $\psi(f) = \xi([f])$ for
all $f \in F$. The latter means that $\overline{\psi} = \xi$. Therefore, the map (7.10) is bijective. $\square$


7.3 Dual Linear Maps 


7.3.1 Pullback of Linear Forms 


Associated with every linear map $F : U \to W$ is its dual map $F^* : W^* \to U^*$ that
sends the linear form $\xi : W \to \Bbbk$ to its composition with $F$, i.e.,


$F^*(\xi) = \xi \circ F : u \mapsto \xi(F(u)).$

Note that $F^*$ acts in the opposite direction to $F$. For this reason it is often called a
pullback of linear forms from $W$ to $U$.


Exercise 7.11 Check that $\xi \circ F : U \to \Bbbk$ is a linear form bilinearly depending on $\xi$
and $F$.


Thus, there is a well-defined linear dualization map

$\operatorname{Hom}(U, W) \to \operatorname{Hom}(W^*, U^*), \quad F \mapsto F^*.$ (7.11)


In the more symmetric "contraction notation" from Example 7.6 on p. 161, the dual
maps $F$ and $F^*$ are related by the condition


$\langle u , F^* \xi \rangle = \langle F u , \xi \rangle \quad \forall\, \xi \in W^*, \ u \in U.$ (7.12)


It is obvious from this formula that the canonical identifications $U^{**} \simeq U$, $W^{**} \simeq W$ take
$F^{**} : U^{**} \to W^{**}$ back to $F : U \to W$. Thus, the dualization maps $F \mapsto F^*$ and
$F^* \mapsto F^{**} = F$ are inverse to each other. In particular, the dualization map (7.11)
is an isomorphism of vector spaces.


Exercise 7.12 Show that $(F \circ G)^* = G^* \circ F^*$.


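Proposition 7.5 $\ker F^* = \operatorname{Ann} \operatorname{im} F$ and $\operatorname{im} F^* = \operatorname{Ann} \ker F$.

Proof The first equality follows at once from the definition of $F^*$ written in the
form (7.12):

$\xi \in \operatorname{Ann} \operatorname{im} F \iff \forall\, u \in U \ \langle F u , \xi \rangle = 0 \iff \forall\, u \in U \ \langle u , F^* \xi \rangle = 0 \iff F^* \xi = 0.$

If we write the first equality for $F^*$ instead of $F$ and take annihilators of both sides,
we get the second equality. $\square$

Corollary 7.3 Injectivity (respectively surjectivity) of $F$ is equivalent to surjectivity
(respectively injectivity) of $F^*$. $\square$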


Exercise 7.13 Convince yourself that the two maps described in the statements (2),
(3) of Proposition 7.2 on p. 160 are dual to each other and deduce the equivalence
(2) $\Leftrightarrow$ (3) from Corollary 7.3.


7.3.2 Rank of a Matrix 


Consider two finite-dimensional vector spaces $U$, $W$ and choose two pairs of
dual bases: $u = (u_1, u_2, \ldots, u_n)$, $u^* = (u_1^*, u_2^*, \ldots, u_n^*)$ in $U$, $U^*$ and $w =
(w_1, w_2, \ldots, w_m)$, $w^* = (w_1^*, w_2^*, \ldots, w_m^*)$ in $W$, $W^*$. In Sect. 6.3.2 on p. 136, for
every linear map $F : U \to W$ we have defined its matrix $F_{wu} = (f_{ij})$, whose $j$th
column is formed by the coordinates $f_{ij}$, $1 \leq i \leq m$, of the vector $F(u_j)$ in the
basis $w$. These coordinates come from the linear expansion


$F(u_j) = f_{1j} w_1 + f_{2j} w_2 + \cdots + f_{mj} w_m$


and can be computed as contractions $f_{ij} = \langle F u_j , w_i^* \rangle = \langle u_j , F^* w_i^* \rangle$. The same
contraction computes the $j$th coordinate of the vector $F^*(w_i^*)$ in the basis $u^*$, i.e., the
$(j, i)$th element $f^*_{ji}$ of the matrix $F^*_{u^* w^*} = (f^*_{ij})$ attached to the dual operator $F^*$ in
the dual bases $u^*$ and $w^*$. In other words, the $i$th row of the matrix $F_{wu}$ coincides
with the $i$th column of the matrix $F^*_{u^* w^*}$ and conversely. This means that the matrices
$F_{wu}$ and $F^*_{u^* w^*}$ are obtained from each other by reflection in the bisector of the upper
left corner, or equivalently, by interchanging the indices $i$, $j$ of the matrix elements.
This operation is called transposition of a matrix. The matrix $A^t = (a^t_{ij})$ transposed
to the matrix $A = (a_{ij}) \in \operatorname{Mat}_{m \times n}$ lies in $\operatorname{Mat}_{n \times m}$ and has $a^t_{ij} = a_{ji}$. Therefore, the
matrices of dual operators written in dual pairs of bases are transposes of each other:

$F^*_{u^* w^*} = F^t_{wu}.$
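
The relation between the contraction identity (7.12) and transposition can be checked on numbers. In the sketch below, vectors and covectors are coordinate lists in the standard basis and its dual, so the dual map acts on covector coordinates by the transposed matrix. The helper names (`mat_vec`, `transpose`, `contract`) and the sample data are assumptions for illustration.

```python
def mat_vec(a, x):
    """Apply the matrix a (list of rows) to the coordinate column x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def transpose(a):
    """Interchange the indices i, j of the matrix elements."""
    return [list(col) for col in zip(*a)]


def contract(v, xi):
    """Contraction <v, xi> of a vector with a covector in dual bases."""
    return sum(p * q for p, q in zip(v, xi))


F = [[1, 2, 0],
     [3, -1, 4]]          # a linear map k^3 -> k^2 (hypothetical sample)
u = [2, 5, -1]            # vector in k^3
xi = [7, -3]              # covector on k^2

lhs = contract(mat_vec(F, u), xi)              # <F u, xi>
rhs = contract(u, mat_vec(transpose(F), xi))   # <u, F^t xi>
print(lhs, rhs)
```

Both contractions agree, which is exactly the statement that the dual operator is represented by the transposed matrix in dual bases.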


Theorem 7.3 For every matrix $A \in \operatorname{Mat}_{m \times n}(\Bbbk)$, the linear span of its rows in $\Bbbk^n$
and the linear span of its columns in $\Bbbk^m$ have equal dimensions.


Proof Write $F : \Bbbk^n \to \Bbbk^m$ for the linear map having the matrix $A$ in the standard
bases of $\Bbbk^n$ and $\Bbbk^m$. Then the linear span of the columns of $A$ is the image of $F$,
whereas the linear span of the rows is the image of the dual operator $F^* : \Bbbk^{m*} \to \Bbbk^{n*}$.
By Proposition 7.5 and Proposition 6.1, we have $\dim \operatorname{im} F^* = \dim \operatorname{Ann} \ker F =
n - \dim \ker F = \dim \operatorname{im} F$. $\square$


Definition 7.1 The two equal dimensions from Theorem 7.3 are called the rank of
the matrix $A$ and denoted by $\operatorname{rk} A$.
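
Theorem 7.3 can be observed on a concrete matrix. The sketch below computes the dimension of the row span by exact elimination over $\mathbb{Q}$ and applies the same routine to the transposed matrix to obtain the dimension of the column span. The function name `rank` and the sample matrix are assumptions for this example; elimination itself is the subject of Sect. 8.4.

```python
from fractions import Fraction


def rank(rows):
    """Dimension of the linear span of the given rows, by elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):    # clear the column below the pivot
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],     # twice the first row
     [0, 1, 1, 0]]
A_t = [list(col) for col in zip(*A)]

row_rank = rank(A)       # dimension of the span of rows in k^4
col_rank = rank(A_t)     # dimension of the span of columns in k^3
print(row_rank, col_rank)
```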


Theorem 7.4 (Capelli–Fontené–Frobenius–Kronecker–Rouché Theorem)¹²
The system of linear equations


$$\begin{cases}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1, \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2, \\
a_{31} x_1 + a_{32} x_2 + \cdots + a_{3n} x_n = b_3, \\
\cdots\cdots\cdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m
\end{cases}$$


is solvable if and only if


$$\operatorname{rk} \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{pmatrix}
= \operatorname{rk} \begin{pmatrix}
a_{11} & a_{12} & \ldots & a_{1n} & b_1 \\
a_{21} & a_{22} & \ldots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn} & b_m
\end{pmatrix}.$$


Proof The solvability of the system $A(x) = b$ means that the right-hand-side
column $b$ lies in the linear span of the columns of the left-hand-side matrix $A =
(a_{ij})$. This happens if and only if the attachment of the column $b$ to the matrix $A$ does
not change the linear span of the columns. The latter is equivalent to preservation of
the dimension of the linear span. $\square$


¹²This theorem is known as the Rouché–Fontené theorem in France, the Rouché–Capelli theorem
in Italy, the Kronecker–Capelli theorem in Russia, and the Rouché–Frobenius theorem in Spain
and Latin America.
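
The criterion of Theorem 7.4 translates directly into a solvability test: compare the rank of the coefficient matrix with the rank of the augmented matrix. The following sketch does this with exact rational arithmetic. The names `rank` and `is_solvable` and the sample systems are assumptions made for illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_solvable(a, b):
    """True iff attaching the column b to a does not increase the rank."""
    augmented = [row + [rhs] for row, rhs in zip(a, b)]
    return rank(a) == rank(augmented)


A = [[1, 2],
     [2, 4],
     [1, 1]]
print(is_solvable(A, [3, 6, 2]))   # consistent: x1 = x2 = 1 solves the system
print(is_solvable(A, [3, 7, 2]))   # the first two equations contradict each other
```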


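Corollary 7.4 The solutions of a system of homogeneous linear equations


$$\begin{cases}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = 0, \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = 0, \\
a_{31} x_1 + a_{32} x_2 + \cdots + a_{3n} x_n = 0, \\
\cdots\cdots\cdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = 0
\end{cases}$$


form a vector subspace of codimension $\operatorname{rk} A$ in $\Bbbk^n$, where $A = (a_{ij})$ is the matrix of
coefficients.


Proof Write $F : \Bbbk^n \to \Bbbk^m$ for the linear map with matrix $A$. Then the solution
subspace is nothing but $\ker F$, and $\dim \ker F = n - \dim \operatorname{im} F = n - \operatorname{rk} A$. $\square$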


Problems for Independent Solution to Chap. 7 


Problem 7.1 Under the conditions of Example 7.7 on p. 161, find the basis in $\Bbbk^2$
dual to the standard basis $e_1 = (1, 0)$, $e_2 = (0, 1)$ under the pairing $\Bbbk^2 \times \Bbbk^2 \to \Bbbk$,
$(a, b) \mapsto \det(a, b)$. Find the vector $b_\varphi \in \Bbbk^2$ corresponding to the linear form
$\varphi(x_1, x_2) = x_1 + x_2$ under this pairing.

Problem 7.2 Write $D = \frac{d}{dx} : \mathbb{Q}[x] \to \mathbb{Q}[x]$ for the differentiation operator $f \mapsto f'$
and consider the polynomial ring¹³ $\mathbb{Q}[D]$. Show that a perfect pairing between¹⁴
$\mathbb{Q}[D] / (D^{n+1})$ and $\mathbb{Q}[x]_{\leq n}$ is well defined by the assignment $(\Phi, f) \mapsto \Phi f(0)$,


¹³It is called the ring of linear differential operators of finite order with constant coefficients.
¹⁴Here $\mathbb{Q}[D] / (D^{n+1})$ means the quotient of the polynomial ring $\mathbb{Q}[D]$ by the principal ideal
spanned by $D^{n+1}$. The vector space $\mathbb{Q}[x]_{\leq n}$ can be thought of as the space of solutions of the linear
which takes $\Phi \in \mathbb{Q}[D]$ and $f \in \mathbb{Q}[x]$ to the constant term of the polynomial¹⁵ $\Phi f$.
Find the basis in $\mathbb{Q}[D] / (D^{n+1})$ dual to the monomial basis of $\mathbb{Q}[x]_{\leq n}$ under this
pairing. Describe the linear endomorphism $D^* \in \operatorname{End}\left(\mathbb{Q}[D] / (D^{n+1})\right)$ dual to the
differentiation $D \in \operatorname{End}\left(\mathbb{Q}[x]_{\leq n}\right)$.


Problem 7.3 Show that the matrix $A = (a_{ij}) \in \operatorname{Mat}_{m \times n}(\Bbbk)$ has rank 1 if and only if
$a_{ij} = x_i y_j$ for some nonzero $(x_1, x_2, \ldots, x_m) \in \Bbbk^m$ and $(y_1, y_2, \ldots, y_n) \in \Bbbk^n$.


Problem 7.4 Let the matrix $A = (a_{ij}) \in \operatorname{Mat}_{m \times n}(\Bbbk)$ have $a_{ij} = x_i + y_j$ for some
$x_i, y_j \in \Bbbk$. Show that $\operatorname{rk} A \leq 2$.


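Problem 7.5 Given two matrices $A_1, A_2 \in \operatorname{Mat}_{m \times n}(\Bbbk)$, let $V_1, V_2 \subset \Bbbk^n$ and
$W_1, W_2 \subset \Bbbk^m$ be the linear spans of their rows and columns respectively. Show
that the following conditions are equivalent: (a) $\operatorname{rk}(A_1 + A_2) = \operatorname{rk}(A_1) + \operatorname{rk}(A_2)$,
(b) $V_1 \cap V_2 = 0$, (c) $W_1 \cap W_2 = 0$.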

Problem 7.6 Show that every matrix of rank r is the sum of at most r matrices of 
rank 1 and give an example of a rank-r matrix that is not representable as a sum 
of fewer than r rank-1 matrices. 


Problem 7.7 (Commutative Triangle) A diagram of maps between sets is called
commutative if for every pair of sets $X$, $Y$ on the diagram every pair of chains of
successive maps joining $X$ with $Y$ have equal compositions $X \to Y$. For example,
the commutativity of the triangle diagram


$A \xrightarrow{\ \alpha\ } B \xrightarrow{\ \beta\ } C, \qquad \gamma : A \to C$ (7.13)


means that $\gamma = \beta \circ \alpha$. Written below are some properties of a commutative
triangle (7.13) consisting of abelian groups $A$, $B$, $C$ and their homomorphisms
$\alpha$, $\beta$, $\gamma$. Prove the true properties and disprove the wrong ones by appropriate
counterexamples.


(a) If $\alpha$ and $\beta$ are surjective, then $\gamma$ is surjective.

(b) If $\alpha$ and $\beta$ are injective, then $\gamma$ is injective.

(c) If $\gamma$ is surjective, then $\alpha$ is surjective.

(d) If $\gamma$ is surjective, then $\beta$ is surjective.

(e) If $\gamma$ is injective, then $\alpha$ is injective.

(f) If $\gamma$ is injective, then $\beta$ is injective.

(g) If $\alpha$ is surjective, then $\gamma$ is surjective if and only if $\beta$ is.
(h) If $\alpha$ is surjective, then $\gamma$ is injective if and only if $\beta$ is.


differential equation $D^{n+1} y = 0$ in the unknown function $y$. Both $\mathbb{Q}[D] / (D^{n+1})$ and $\mathbb{Q}[x]_{\leq n}$ are
considered just vector spaces over $\mathbb{Q}$.

¹⁵Recall that $\Phi f \stackrel{\text{def}}{=} a_0 f + a_1 D f + a_2 D^2 f + \cdots + a_n D^n f$ for $\Phi = a_0 + a_1 D + \cdots + a_n D^n$ (compare
with Sect. 4.4 on p. 88).
(i) If $\beta$ is surjective, then $\gamma$ is surjective if and only if $\alpha$ is.
(j) If $\beta$ is surjective, then $\gamma$ is injective if and only if $\alpha$ is.
(k) If $\gamma$ is bijective, then $\alpha$ is injective and $\beta$ is surjective.


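Problem 7.8 A chain $\cdots \to * \to * \to * \to * \to \cdots$ of homomorphisms of abelian
groups is called an exact sequence if $\ker \psi = \operatorname{im} \varphi$ for every two consecutive
maps $* \xrightarrow{\ \varphi\ } * \xrightarrow{\ \psi\ } *$ in the chain. For example, the exactness of the diagrams

$0 \to * \xrightarrow{\ \varphi\ } *$ and $* \xrightarrow{\ \psi\ } * \to 0$

means that $\varphi$ is injective and $\psi$ is surjective. For an exact sequence of vector
spaces $0 \to U \xrightarrow{\ \varphi\ } V \xrightarrow{\ \psi\ } W \to 0$ show that the dual diagram $0 \to W^* \xrightarrow{\ \psi^*\ } V^* \xrightarrow{\ \varphi^*\ }
U^* \to 0$ is exact as well.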


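Problem 7.9 For the following commutative diagram of abelian groups with exact
rows,


$$\begin{array}{ccccccccc}
0 & \to & V' & \to & V & \to & V'' & \to & 0 \\
 & & \downarrow{\scriptstyle \varphi'} & & \downarrow{\scriptstyle \varphi} & & \downarrow{\scriptstyle \varphi''} & & \\
0 & \to & W' & \to & W & \to & W'' & \to & 0
\end{array} \qquad (7.14)$$


(a) Show that if $\varphi'$ and $\varphi''$ are isomorphisms, then $\varphi$ is an isomorphism too.
(b) Give a counterexample disproving the opposite implication.
(c) Show that if $\varphi$ is bijective, then $\varphi'$ is injective and $\varphi''$ is surjective.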


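Problem 7.10 (Five Lemma) For a commutative diagram of abelian groups with
exact rows


$$\begin{array}{ccccccccc}
V_1 & \to & V_2 & \to & V_3 & \to & V_4 & \to & V_5 \\
\downarrow{\scriptstyle \varphi_1} & & \downarrow{\scriptstyle \varphi_2} & & \downarrow{\scriptstyle \varphi_3} & & \downarrow{\scriptstyle \varphi_4} & & \downarrow{\scriptstyle \varphi_5} \\
W_1 & \to & W_2 & \to & W_3 & \to & W_4 & \to & W_5
\end{array}$$


where $\varphi_1$ is surjective, $\varphi_5$ is injective, and both $\varphi_2$, $\varphi_4$ are bijective, prove that $\varphi_3$
is bijective.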
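Problem 7.11 For a commutative diagram of abelian groups with exact rows


$$\begin{array}{ccccccccc}
0 & \to & V' & \to & V & \to & V'' & \to & 0 \\
 & & & & \downarrow{\scriptstyle \varphi} & & \downarrow{\scriptstyle \varphi''} & & \\
0 & \to & W' & \to & W & \to & W'' & \to & 0
\end{array}$$


prove that there exists a unique homomorphism $\varphi' : V' \to W'$ making the
left-hand square commutative. Formulate and prove a similar statement for the
diagram


$$\begin{array}{ccccccccc}
0 & \to & V' & \to & V & \to & V'' & \to & 0 \\
 & & \downarrow{\scriptstyle \varphi'} & & \downarrow{\scriptstyle \varphi} & & & & \\
0 & \to & W' & \to & W & \to & W'' & \to & 0
\end{array}$$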


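Problem 7.12 (Snake Lemma) Given a homomorphism of abelian groups $\varphi :
U \to W$, the quotient group $\operatorname{coker} \varphi = W / \operatorname{im} \varphi$ is called the cokernel of $\varphi$. For
the commutative diagram (7.14) with exact rows, construct an exact sequence of
homomorphisms


$0 \to \ker \varphi' \to \ker \varphi \to \ker \varphi'' \to \operatorname{coker} \varphi' \to \operatorname{coker} \varphi \to \operatorname{coker} \varphi'' \to 0.$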


Problem 7.13 (Eigenvectors and Eigenvalues) Let $F : V \to V$ be a linear
endomorphism of an arbitrary vector space $V$ over a field $\Bbbk$. A nonzero vector
$v \in V$ is called an eigenvector of $F$ with eigenvalue $\lambda \in \Bbbk$ if $F(v) = \lambda v$. Prove
that:


(a) Every set of eigenvectors with distinct eigenvalues is linearly independent.
(b) If every nonzero vector in $V$ is an eigenvector,¹⁶ then $F = \lambda \cdot \mathrm{Id}_V$ for some
$\lambda \in \Bbbk$.


Problem 7.14 (Nilpotent Endomorphisms) A linear endomorphism $F : V \to V$
is called nilpotent if $F^n = 0$ for some $n \in \mathbb{N}$. For such $F$, prove that $\ker F \neq 0$,
$F^n = 0$ for $n = \dim V$, and all eigenvalues of $F$ equal zero.

Problem 7.15 (Diagonalizable Endomorphisms) A linear endomorphism $F :
V \to V$ is called diagonalizable if it has a diagonal matrix¹⁷ in some basis of
$V$. For every such $F$ and subspace $U \subset V$ such that $F(U) \subset U$, prove that the

¹⁶Whose eigenvalue may depend on the vector.
¹⁷A matrix $A = (a_{ij})$ is diagonal if $a_{ij} = 0$ for all $i \neq j$.
restriction $F|_U : U \to U$ is diagonalizable too. Show that the following linear
endomorphisms are not diagonalizable:


(a) Multiplication by the class $[t]$ in the factor ring¹⁸ $\Bbbk[t] / (t^n)$ for $n \geq 2$.
(b) Differentiation $D : f \mapsto f'$ in the space of polynomials of degree at most $n$ for
$n \geq 1$.


¹⁸Considered as a vector space over $\Bbbk$.


Chapter 8 
Matrices 


8.1 Associative Algebras over a Field 


8.1.1 Definition of Associative Algebra 


A vector space $A$ over a field $\Bbbk$ equipped with a multiplication $A \times A \to A$ is called
an algebra over the field $\Bbbk$, or just a $\Bbbk$-algebra for short, if for every $a \in A$, both
the left multiplication map $\lambda_a : A \to A$, $v \mapsto av$, and the right multiplication
map $\varrho_a : A \to A$, $v \mapsto va$, are linear. This means that multiplication of vectors
by constants commutes with the algebra multiplication: $(\lambda a) b = \lambda (ab) = a (\lambda b)$
for all $\lambda \in \Bbbk$ and $a, b \in A$, and the standard distributive law holds for addition
and algebra multiplication: $(a_1 + b_1)(a_2 + b_2) = a_1 a_2 + a_1 b_2 + b_1 a_2 + b_1 b_2$ for
every $a_1, a_2, b_1, b_2 \in A$. An algebra $A$ is called associative if $(ab)c = a(bc)$ for all
$a, b, c \in A$, and commutative if $ab = ba$ for all $a, b \in A$. If there exists $e \in A$ such
that $ea = ae = a$ for all $a \in A$, then $e$ is called a unit, and we say that $A$ is an
algebra with unit.


Exercise 8.1 For a $\Bbbk$-algebra $A$, verify that the unit $e \in A$ is unique (if it exists) and
$0 \cdot a = 0$ for all $a \in A$.


Given two $\Bbbk$-algebras $A$, $B$, every $\Bbbk$-linear map $\varphi : A \to B$ such that $\varphi(a_1 a_2) =
\varphi(a_1) \varphi(a_2)$ for all $a_1, a_2 \in A$ is called a homomorphism of $\Bbbk$-algebras.

Basic examples of commutative associative $\Bbbk$-algebras are provided by the
polynomial algebra and its factor algebras.¹ The main motivating example of a
noncommutative $\Bbbk$-algebra is provided by the algebra of linear endomorphisms of a
vector space.


¹See Sect. 5.2.4 on p. 109.
Example 8.1 (Endomorphism Algebra of a Vector Space) The composition of two
linear maps $G : U \to V$, $F : V \to W$ is linear, because


$FG(\lambda u + \mu w) = F(\lambda G(u) + \mu G(w)) = \lambda FG(u) + \mu FG(w).$

Considered as a binary operation, the composition map

$\operatorname{Hom}(V, W) \times \operatorname{Hom}(U, V) \to \operatorname{Hom}(U, W), \quad (F, G) \mapsto FG,$


is bilinear: $(\lambda_1 F_1 + \lambda_2 F_2) G = \lambda_1 F_1 G + \lambda_2 F_2 G$ and $F(\mu_1 G_1 + \mu_2 G_2) = \mu_1 F G_1 +
\mu_2 F G_2$. Thus, all linear endomorphisms of a vector space $V$ over a field $\Bbbk$ form a
$\Bbbk$-algebra $\operatorname{End} V = \operatorname{Hom}(V, V)$ with unit $e = \mathrm{Id}_V$. It is called the endomorphism
algebra of $V$. The endomorphism algebra is associative, because both $F(GH)$ and
$(FG)H$ map $u \mapsto F(G(H(u)))$.


Exercise 8.2 For a coordinate vector space $\Bbbk^n$, verify that the $n^2$ linear maps² $E_{ij} :
\Bbbk^n \to \Bbbk^n$ mapping $e_j \mapsto e_i$ and $e_\nu \mapsto 0$ for all $\nu \neq j$ form a basis of $\operatorname{End}(\Bbbk^n)$
over $\Bbbk$. Write the multiplication table of these basis maps and show that $\operatorname{End}(\Bbbk^n)$ is
noncommutative for $n \geq 2$.


8.1.2 Invertible Elements 


Given an algebra $A$ with unit $e \in A$, an element $a \in A$ is called invertible if there
exists $a^{-1} \in A$ such that $a a^{-1} = a^{-1} a = e$. For an associative algebra $A$, it is enough
to demand the existence of $a', a'' \in A$ such that $a' a = a a'' = e$. Then automatically
$a' = a' e = a' (a a'') = (a' a) a'' = e a'' = a''$. The same computation shows that $a^{-1}$
is uniquely determined by $a$ in every associative algebra.


Example 8.2 (General Linear Group $\operatorname{GL} V \subset \operatorname{End} V$) By Proposition 1.3, the
invertible elements of $\operatorname{End} V$ are the linear isomorphisms $V \xrightarrow{\ \sim\ } V$. They form a
transformation group³ of $V$, denoted by $\operatorname{GL} V$ and called the general linear group
of $V$.


²Compare with Proposition 6.2 on p. 137.
³See Sect. 1.3.4 on p. 12.
8.1.3 Algebraic and Transcendental Elements


Let $A$ be an associative algebra with unit $e$ over a field $\Bbbk$. Associated with each
element $\xi \in A$ is the evaluation homomorphism⁴


$\mathrm{ev}_\xi : \Bbbk[x] \to A, \quad f(x) \mapsto f(\xi) \in A,$ (8.1)


which sends a polynomial $a_0 x^m + a_1 x^{m-1} + \cdots + a_{m-1} x + a_m \in \Bbbk[x]$ to its value for
$x = \xi$ calculated within $A$, that is, to $a_0 \xi^m + a_1 \xi^{m-1} + \cdots + a_{m-1} \xi + a_m e \in A$. Note
that the constant term $a_m = a_m \cdot x^0$ evaluates to $a_m \xi^0 \stackrel{\text{def}}{=} a_m e \in A$ by definition.

An element $\xi \in A$ is called transcendental over $\Bbbk$ if the evaluation homomorphism
(8.1) is injective. Equivalently, the transcendence of $\xi$ over $\Bbbk$ means that all
nonnegative integer powers $\xi^m$ are linearly independent over $\Bbbk$.

An element $\xi \in A$ is called algebraic over $\Bbbk$ if the evaluation homomorphism
(8.1) has nonzero kernel. In this case, $\ker \mathrm{ev}_\xi = (\mu_\xi)$ is the principal ideal⁵
generated by some monic polynomial $\mu_\xi \in \Bbbk[x]$, which is uniquely determined
by $\xi$ as the monic polynomial of minimal positive degree such that $\mu_\xi(\xi) = 0$.
The polynomial $\mu_\xi$ is called the minimal polynomial of $\xi$ over $\Bbbk$. Note that every
polynomial $f \in \Bbbk[x]$ such that $f(\xi) = 0$ is divisible by $\mu_\xi$. Equivalently, the
algebraicity of $\xi$ over $\Bbbk$ means the existence of a nontrivial linear relation between
nonnegative integer powers $\xi^m$. In particular, if a $\Bbbk$-algebra $A$ is of finite dimension
over $\Bbbk$, then all its elements $a \in A$ are algebraic over $\Bbbk$.


Example 8.3 (Algebraicity of Linear Endomorphisms) Since $\dim_\Bbbk \operatorname{End}(V) = n^2$ for $n = \dim V$,
the $n^2 + 1$ iterations $F^0, F^1, F^2, \ldots, F^{n^2}$ of a linear endomorphism $F \in \operatorname{End}(V)$ are
linearly related. Therefore, $F$ satisfies some polynomial equation of degree at most
$n^2$. In Sect. 9.6.3 on p. 222 we will show⁶ that $F$ can actually be annihilated by an
appropriate polynomial of degree $n$.


8.2 Matrix Algebras


8.2.1 Multiplication of Matrices


Consider a triple of coordinate vector spaces $\Bbbk^n$, $\Bbbk^s$, $\Bbbk^m$ and write


$u_1, u_2, \ldots, u_n \in \Bbbk^n, \quad v_1, v_2, \ldots, v_s \in \Bbbk^s, \quad w_1, w_2, \ldots, w_m \in \Bbbk^m$


⁴Compare with Sect. 3.4.2 on p. 54.


⁵Recall that $\Bbbk[x]$ is a principal ideal domain. See Sect. 5.3 on p. 109 for the common properties of
such rings.


⁶See also Example 8.4 below.
for their standard bases. Assume that linear maps $B : \Bbbk^n \to \Bbbk^s$ and $A : \Bbbk^s \to \Bbbk^m$
have matrices $B = (b_{ij})$ and $A = (a_{ij})$, respectively, in these bases. Then the matrix $P = (p_{ij})$ of
their composition


$P = AB : \Bbbk^n \to \Bbbk^m$


is called the product of matrices⁷ $A$ and $B$. Thus, we get a product defined for every
ordered pair of matrices such that the width of the left matrix equals the height of
the right. The height of the product is the height of the left factor, and the width of
the product is the width of the right factor. The element $p_{ij}$ in the $i$th row and $j$th
column of the product is equal to the coefficient of $w_i$ in the expansion


$AB(u_j) = A\Big( \sum_k v_k b_{kj} \Big) = \sum_k A(v_k) \, b_{kj} = \sum_i \sum_k w_i \, a_{ik} b_{kj}.$


Therefore, $p_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{is} b_{sj}$. This multiplication rule can be
reformulated in several equivalent ways suitable for different kinds of practical
computations. First of all, one row and one column of the same size $s$ are
multiplied as


$(a_1, a_2, \ldots, a_s) \cdot \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_s \end{pmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_s b_s.$

The result can be viewed either as a linear combination of the $a_i$ with coefficients
$b_i$, or symmetrically as a linear combination of the $b_i$ with coefficients $a_i$. For a
matrix $A$ formed by $m$ rows of size $s$ and a matrix $B$ formed by $n$ columns of size
$s$, the product $P = AB$ has $m$ rows and $n$ columns, and the $(i, j)$ member of $P$ is the
product of the $i$th row of $A$ and the $j$th column of $B$:


$p_{ij} = (a_{i1}, a_{i2}, \ldots, a_{is}) \cdot \begin{pmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{sj} \end{pmatrix}.$ (8.2)


This means that the $j$th column of $AB$ is a linear combination of $s$ columns⁸ of $A$ with
coefficients taken from the $j$th column of $B$. For example, if we wish to transform

⁷Note that the order of multipliers in the product of matrices is the same as in the composition of
the corresponding linear maps.


⁸Considered as vectors in $\Bbbk^m$.
the matrix


$C = \begin{pmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \end{pmatrix}$ (8.3)


to the matrix $C'$ such that


• the first column of $C'$ is the sum of the first column of $C$ and the second column
of $C$ multiplied by $\lambda$,

• the second column of $C'$ is the sum of the first and third columns of $C$,

• the third column of $C'$ is the sum of the third column of $C$ and the second column
of $C$ multiplied by 2,

• $C'$ gets an extra fourth column equal to the sum of each of the columns of $C$
multiplied by its column number,


then we have to multiply $C$ from the right side by the matrix


$\begin{pmatrix} 1 & 1 & 0 & 1 \\ \lambda & 0 & 2 & 2 \\ 0 & 1 & 1 & 3 \end{pmatrix}.$


Exercise 8.3 Verify this by explicit computation of all matrix elements by for- 
mula (8.2). 
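
The column description of the product can be tested on numbers. The sketch below builds $C'$ in two ways: by multiplying $C$ from the right by the $3 \times 4$ matrix whose columns encode the four rules listed above, and by forming the prescribed column combinations directly. The sample entries of `C`, the value chosen for $\lambda$, and the helper names are assumptions for this illustration.

```python
def matmul(a, b):
    """Product of matrices given as lists of rows, by formula (8.2)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def column(m, j):
    """The j-th column (0-based) of the matrix m as a list."""
    return [row[j] for row in m]


lam = 10                                  # sample value for lambda
C = [[1, 2, 3],
     [4, 5, 6]]
M = [[1,   1, 0, 1],
     [lam, 0, 2, 2],
     [0,   1, 1, 3]]

c1, c2, c3 = column(C, 0), column(C, 1), column(C, 2)
expected_columns = [
    [x + lam * y for x, y in zip(c1, c2)],              # c1 + lambda * c2
    [x + z for x, z in zip(c1, c3)],                    # c1 + c3
    [z + 2 * y for y, z in zip(c2, c3)],                # c3 + 2 * c2
    [x + 2 * y + 3 * z for x, y, z in zip(c1, c2, c3)], # c1 + 2 c2 + 3 c3
]

C_prime = matmul(C, M)
print(C_prime)
```

Each column of the right factor lists the coefficients with which the columns of $C$ enter the corresponding column of $C'$.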


Symmetrically, the $i$th row of $AB$ is a linear combination of $s$ rows⁹ of the matrix $B$
with its coefficients taken from the $i$th row of $A$. For example, if we would like to
transform the same matrix (8.3) to the matrix $C''$ such that


• the first row of $C''$ is the second row of $C$,
• the second row of $C''$ is the sum of the second row of $C$ and the first row of $C$
multiplied by $\lambda$,


then we have to multiply $C$ from the left by the matrix $\begin{pmatrix} 0 & 1 \\ \lambda & 1 \end{pmatrix}$.


Exercise 8.4 Verify this in two ways: using the previous description of the columns 
of the product and via straightforward computation of all elements by formula (8.2). 


Comparison of the row-description and the column-description for the product
$AB$ leads to the conclusion that the transposition¹⁰ of matrices interacts with the
multiplication by the rule


$(AB)^t = B^t A^t.$ (8.4)


Therefore, the transposition map $C \mapsto C^t$ is an antiautomorphism of matrix
algebras, meaning that it sends a product to the product taken in reverse order.


⁹Considered as vectors in $\Bbbk^n$.
¹⁰See Sect. 7.3.2 on p. 165.


Exercise 8.5 Verify the equality (8.4) by explicit computation of all matrix 
elements by the formula (8.2). 


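Since the composition of linear maps is associative and bilinear, the product of
matrices inherits the same properties, that is, $(FG)H = F(GH)$ for all $F \in \operatorname{Mat}_{m \times k}$,
$G \in \operatorname{Mat}_{k \times \ell}$, $H \in \operatorname{Mat}_{\ell \times n}$, and

$(\lambda_1 F_1 + \mu_1 G_1)(\lambda_2 F_2 + \mu_2 G_2) = \lambda_1 \lambda_2 F_1 F_2 + \lambda_1 \mu_2 F_1 G_2 + \mu_1 \lambda_2 G_1 F_2 + \mu_1 \mu_2 G_1 G_2$


for all $F_1, G_1 \in \operatorname{Mat}_{m \times k}(\Bbbk)$, $F_2, G_2 \in \operatorname{Mat}_{k \times n}(\Bbbk)$, $\lambda_i, \mu_i \in \Bbbk$. Hence the square matrices of size $n \times n$ form
an associative $\Bbbk$-algebra with unit


$E = \begin{pmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \ldots & 0 & 1 \end{pmatrix}$


(the ones on the main diagonal are the only nonzero matrix elements). This algebra
is denoted by


$\operatorname{Mat}_n(\Bbbk) \stackrel{\text{def}}{=} \operatorname{Mat}_{n \times n}(\Bbbk) \simeq \operatorname{End}(\Bbbk^n).$


For $n \geq 2$, the algebra $\operatorname{Mat}_n(\Bbbk)$ is noncommutative. For example,


$\begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 4 & 5 \end{pmatrix} = \begin{pmatrix} 11 & 10 \\ 12 & 15 \end{pmatrix}$ but $\begin{pmatrix} 3 & 0 \\ 4 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 3 & 6 \\ 4 & 23 \end{pmatrix}.$
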
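
The two products in this example can be recomputed mechanically by the rule $p_{ij} = \sum_k a_{ik} b_{kj}$. The following sketch is a direct transcription of that rule; the function name `matmul` is an assumption.

```python
def matmul(a, b):
    """Product of matrices given as lists of rows: p_ij = sum_k a_ik * b_kj."""
    assert len(a[0]) == len(b), "width of the left factor must equal height of the right"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2],
     [0, 3]]
B = [[3, 0],
     [4, 5]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB != BA)
```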
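Example 8.4 (Annihilating Polynomial of a $2 \times 2$ Matrix) Let us show that every
$2 \times 2$ matrix satisfies a quadratic equation:


$F = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $F^2 = \begin{pmatrix} a^2 + bc & ab + bd \\ ca + dc & cb + d^2 \end{pmatrix} = \begin{pmatrix} a^2 + bc & b(a + d) \\ c(a + d) & cb + d^2 \end{pmatrix}$


can be combined to


$F^2 - (a + d) \cdot F = \begin{pmatrix} a^2 + bc & b(a + d) \\ c(a + d) & cb + d^2 \end{pmatrix} - \begin{pmatrix} a(a + d) & b(a + d) \\ c(a + d) & d(a + d) \end{pmatrix} = \begin{pmatrix} bc - ad & 0 \\ 0 & bc - ad \end{pmatrix} = (bc - ad) \cdot E.$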
Therefore, $F$ satisfies the equation $F^2 - (a + d) F + (ad - bc) E = 0$. The quantities

$\det F \stackrel{\text{def}}{=} ad - bc$ and $\operatorname{tr} F \stackrel{\text{def}}{=} a + d$


are called the determinant¹¹ and trace of the matrix $F$ respectively. In terms of the
trace and determinant, the quadratic equation on $F$ takes the form


$F^2 - \operatorname{tr}(F) \cdot F + \det(F) \cdot E = 0.$ (8.5)
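
Equation (8.5) is easy to test on a sample matrix. The sketch below evaluates $F^2 - \operatorname{tr}(F) \cdot F + \det(F) \cdot E$ entry by entry; the function names and the sample entries are assumptions for this illustration.

```python
def matmul2(p, q):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def quadratic_residual(f):
    """Entries of F^2 - tr(F) * F + det(F) * E for a 2 x 2 matrix F."""
    (a, b), (c, d) = f
    trace = a + d
    det = a * d - b * c
    square = matmul2(f, f)
    unit = [[1, 0], [0, 1]]
    return [[square[i][j] - trace * f[i][j] + det * unit[i][j]
             for j in range(2)]
            for i in range(2)]


F = [[2, -1],
     [5, 3]]
print(quadratic_residual(F))  # all four entries vanish
```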


8.2.2 Invertible Matrices 


The invertible elements of the matrix algebra $\operatorname{Mat}_n(\Bbbk)$ are exactly the matrices of
linear automorphisms of the coordinate space $\Bbbk^n$. They form a transformation group
of $\Bbbk^n$, denoted by $\operatorname{GL}_n(\Bbbk)$ and called the general linear group in dimension $n$ over $\Bbbk$.


Example 8.5 (Invertible $2 \times 2$ Matrices) It follows from (8.5) that for every $F \in
\operatorname{Mat}_2(\Bbbk)$, the equality


$\det(F) \cdot E = \operatorname{tr}(F) \cdot F - F^2 = F \cdot (\operatorname{tr}(F) E - F)$


holds. Assume that $F = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is invertible and multiply both sides by $F^{-1}$:

$\det(F) \cdot F^{-1} = \operatorname{tr}(F) \cdot E - F = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$ (8.6)

This forces $\det F$ to be nonzero, because otherwise, we would get the zero matrix on
the left-hand side and therefore $a = b = c = d = 0$ on the right; this is impossible
for an invertible map $F$. Thus, every invertible matrix $F$ necessarily has $\det F \neq 0$,
and the formula (8.6) gives an explicit expression for $F^{-1}$:


$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = (ad - bc)^{-1} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$ (8.7)


Exercise 8.6 Verify this by the straightforward computation of both products $F F^{-1}$
and $F^{-1} F$.
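
Formula (8.7) gives a two-line inversion routine. The sketch below implements it over the rationals and checks both products on a sample matrix; the function names and the choice of `Fraction` arithmetic are assumptions for this illustration.

```python
from fractions import Fraction


def inverse_2x2(f):
    """Inverse of a 2 x 2 matrix by formula (8.7); fails when det F = 0."""
    (a, b), (c, d) = f
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible: det F = 0")
    return [[d / det, -b / det],
            [-c / det, a / det]]


def matmul2(p, q):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


F = [[2, 1],
     [4, 3]]                      # det F = 2
F_inv = inverse_2x2(F)
print(F_inv)
print(matmul2(F, F_inv), matmul2(F_inv, F))
```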


¹¹See Example 6.4 on p. 125.
8.3 Transition Matrices


Let a vector $v$ be a linear combination of vectors $w_1, w_2, \ldots, w_m$:


$v = \sum_{i=1}^{m} x_i w_i = w_1 x_1 + w_2 x_2 + \cdots + w_m x_m.$ (8.8)


If we organize the coefficients $x_i \in \Bbbk$ in the column matrix


$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}$ (8.9)

and write vectors $w_i$ in the row matrix¹² $w = (w_1, w_2, \ldots, w_m)$, then (8.8) turns
into the matrix equality $v = wx$, where $v$ is a $1 \times 1$ matrix with its element in
$V$. Such matrix notation makes linear expansions of vectors more compact and
transparent, especially when we deal with (numbered) collections of vectors such
as bases and generating systems. For two collections of vectors $u = (u_1, u_2, \ldots, u_n)$
and $w = (w_1, w_2, \ldots, w_m)$ such that each vector $u_j$ is linearly expressed through
$(w_1, w_2, \ldots, w_m)$ as

$u_j = \sum_{\nu=1}^{m} c_{\nu j} w_\nu = w_1 \cdot c_{1j} + w_2 \cdot c_{2j} + \cdots + w_m \cdot c_{mj},$


all these expansions are combined into the single matrix equality $u = w \cdot C_{wu}$, where
the matrix


$C_{wu} = (c_{ij}) = \begin{pmatrix} c_{11} & c_{12} & \ldots & c_{1n} \\ c_{21} & c_{22} & \ldots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{m1} & c_{m2} & \ldots & c_{mn} \end{pmatrix}$ (8.10)


is constructed from $u$ via replacement of each vector $u_j$ by the column of coefficients
belonging to its linear expression through $(w_1, w_2, \ldots, w_m)$. The matrix (8.10) is
called the transition matrix from $u$ to $w$. A particular example of the transition
matrix $x = C_{wv}$ is the column (8.9) formed by the coefficients of the linear
expression of one vector $v$ through the vectors $w$.


¹²Note that the elements of this matrix are vectors.
If vectors $v = (v_1, v_2, \ldots, v_k)$ are linearly expressed through vectors $u$ as $v =
u C_{uv}$, then the transition matrix $C_{wu}$ allows us to express the vectors $v$ through $w$
as $v = w C_{wu} C_{uv}$. Therefore, the transition matrix from $v$ to $w$ is the product of the
transition matrices from $u$ to $w$ and from $v$ to $u$:


$C_{wu} C_{uv} = C_{wv}.$ (8.11)


Remark 8.1 If vectors $e = (e_1, e_2, \ldots, e_n)$ are linearly independent, then the
transition matrix $C_{ew}$ from any collection of vectors $w = (w_1, w_2, \ldots, w_m)$ to $e$
is uniquely determined by $e$ and $w$, meaning that two collections of vectors $u$, $w$
coincide if and only if $C_{eu} = C_{ew}$. If there are some nontrivial linear relations
between vectors $w = (w_1, w_2, \ldots, w_m)$, then every vector $v$ in the linear span of
$w$ allows many different linear expressions¹³ through $w$. Therefore, the notation
$C_{wv}$ is not correct in this case, because the matrix $C_{wv}$ is not uniquely determined
by $w$ and $v$: different matrices may produce the same collection $v$. Nevertheless,
equality (8.11) is still meaningful, and says that given some linear expressions $C_{wu}$,
$C_{uv}$ of vectors $u$, $v$ through the vectors $w$, $u$ respectively, their matrix product
$C_{wu} \cdot C_{uv}$ gives some linear expression $C_{wv}$ of $v$ through $w$.


Lemma 8.1 Let a collection of vectors $v = (v_1, v_2, \ldots, v_n)$ be a basis of $V$. Then
the collection $u = v C_{vu}$ is a basis of $V$ if and only if the transition matrix $C_{vu}$ is
invertible. In this case, $C_{vu}^{-1} = C_{uv}$.


Proof If $u$ is a basis, then the vectors $v$ are linearly expressed through $u$. It follows
from (8.11) that $C_{vv} = C_{vu} C_{uv}$ and $C_{uu} = C_{uv} C_{vu}$. Since the transition matrix from a
collection of vectors to a basis is uniquely determined by the collection, we conclude
that $C_{vv} = C_{uu} = E$. Hence, the transition matrices $C_{uv}$ and $C_{vu}$ are inverse to each
other. If vectors $u$ are linearly related, say $u \lambda = 0$ for some nonzero column of
constants $\lambda$, then $v C_{vu} \lambda = 0$. Hence, $C_{vu} \lambda = 0$, because $v$ is a basis. This matrix equality prevents $C_{vu}$
from being invertible, because otherwise, multiplication of both sides by $C_{vu}^{-1}$ would
give $\lambda = 0$. $\square$


Example 8.6 (Basis Change Effect on Coordinates of Vectors) If the vectors $w = (w_1, w_2, \ldots, w_m)$ are expressed through the basis $e = (e_1, e_2, \ldots, e_n)$ as $w = e\,C_{ew}$ and the vectors $u = e\,C_{eu}$ form another basis, then the transition matrix from $w$ to $u$ is

$$C_{uw} = C_{ue}\,C_{ew} = C_{eu}^{-1}\,C_{ew}.$$

In particular, the coordinate columns $x_e$, $x_u$ of the same vector $v \in V$ in the bases $e$, $u$ are related as $x_u = C_{ue}\,x_e = C_{eu}^{-1}\,x_e$.


13 In Sect. 8.4 we have seen that these linear expansions form an affine subspace in $\Bbbk^m$ parallel to the vector subspace of all linear relations on $w_1, w_2, \ldots, w_m$.


Example 8.7 (Basis Change Effect on Matrices of Linear Maps) For every linear map $F : U \to W$ and collection of vectors $v = (v_1, v_2, \ldots, v_r)$, let us write $F(v)$ for the collection $(F(v_1), F(v_2), \ldots, F(v_r))$ considered as a row matrix with elements in $W$. Since $F$ is linear, it follows that for every matrix $M \in \operatorname{Mat}_{r\times s}(\Bbbk)$, the equality $F(vM) = F(v)M$ holds.


Exercise 8.7 Verify this equality. 


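For every pair of bases $u$, $w$ in $U$, $W$, the matrix $F_{wu}$ of a linear map $F$ in these bases is defined by the assignment¹⁴ $F(u) = w\,F_{wu}$. For another pair of bases $\tilde u = u\,C_{u\tilde u}$ and $\tilde w = w\,C_{w\tilde w}$, the matrix of $F$ is

$$F_{\tilde w\tilde u} = C_{w\tilde w}^{-1}\,F_{wu}\,C_{u\tilde u}, \tag{8.12}$$

because

$$F(\tilde u) = F(u\,C_{u\tilde u}) = F(u)\,C_{u\tilde u} = w\,F_{wu}\,C_{u\tilde u} = \tilde w\,C_{w\tilde w}^{-1}\,F_{wu}\,C_{u\tilde u}.$$

In particular, if a linear endomorphism $F : V \to V$ has the matrix $F_e \stackrel{\text{def}}{=} F_{ee}$ in some basis $e$, then for another basis $u = e\,C_{eu}$, the matrix of $F$ will be

$$F_u = C_{eu}^{-1}\,F_e\,C_{eu}. \tag{8.13}$$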
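
The two change-of-basis rules from Examples 8.6 and 8.7 can be checked numerically. The following is only a minimal sketch over $\mathbb{Q}^2$: the helper names `matmul` and `inverse_2x2`, as well as the sample matrices, are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(c):
    """Inverse of an invertible 2 x 2 matrix over Q (hypothetical helper)."""
    (a, b), (p, q) = c
    det = Fraction(a * q - b * p)
    return [[q / det, -b / det], [-p / det, a / det]]

# Assumed sample data: e is the standard basis of Q^2 and u = e * C_eu,
# so the columns of C_eu are the e-coordinates of u_1 = (1, 0), u_2 = (1, 1).
C_eu = [[1, 1], [0, 1]]
C_ue = inverse_2x2(C_eu)

x_e = [[3], [2]]                        # coordinate column of a vector in the basis e
x_u = matmul(C_ue, x_e)                 # Example 8.6: x_u = C_eu^{-1} x_e

F_e = [[2, 0], [0, 3]]                  # matrix of an endomorphism F in the basis e
F_u = matmul(C_ue, matmul(F_e, C_eu))   # formula (8.13)

# Computing F(v) in the basis u and returning to e must give F_e x_e.
assert matmul(C_eu, matmul(F_u, x_u)) == matmul(F_e, x_e)
print(x_u, F_u)
```

Exact rational arithmetic is used so that the final equality can be tested without rounding concerns.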


8.4 Gaussian Elimination 


Gaussian elimination simplifies a rectangular matrix while preserving its rank. This 
allows one to construct an explicit basis in a vector space given as the linear span 
of some vectors, or as the solution of systems of linear equations, or as a quotient 
space, etc. Gaussian elimination is the main computational tool in linear algebra. In 
some sense, it generalizes the Euclidean division algorithm. 


8.4.1 Elimination by Row Operations 


In this section we work in the coordinate space $\Bbbk^n$, whose vectors will be the rows. For every collection of such vectors

$$\begin{aligned}
w_1 &= (w_{11}, w_{12}, \ldots, w_{1n}),\\
w_2 &= (w_{21}, w_{22}, \ldots, w_{2n}),\\
&\;\;\vdots\\
w_k &= (w_{k1}, w_{k2}, \ldots, w_{kn}),
\end{aligned} \tag{8.14}$$

Gauss's method produces another collection of vectors $u_1, u_2, \ldots, u_r$ such that

$$\operatorname{span}(u_1, u_2, \ldots, u_r) = \operatorname{span}(w_1, w_2, \ldots, w_k),$$

and the coordinates of $u_1, u_2, \ldots, u_r$ written in rows,

$$\begin{pmatrix}
u_{11} & u_{12} & \ldots & u_{1n}\\
u_{21} & u_{22} & \ldots & u_{2n}\\
\vdots & \vdots & & \vdots\\
u_{r1} & u_{r2} & \ldots & u_{rn}
\end{pmatrix},$$

form a reduced echelon matrix, meaning that the leftmost nonzero element of each row:

• is placed strictly to the right of the leftmost nonzero element of the previous row,
• is equal to 1,
• is the only nonzero element of its column.


14 See formula (6.20) on p. 136.

A typical example of a reduced echelon matrix looks like this:

$$\begin{pmatrix}
0 & 1 & * & 0 & * & * & 0 & * & 0 & *\\
0 & 0 & 0 & 1 & * & * & 0 & * & 0 & *\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & * & 0 & *\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & *
\end{pmatrix},$$

where asterisks indicate arbitrary constants. We write $j_\nu$ for the number of the column containing the leftmost nonzero element of the $\nu$th row. The resulting strictly increasing sequence of numbers $J = (j_1, j_2, \ldots, j_r)$ is called the shape of the echelon matrix. The echelon matrix in the above example has shape $J = (2, 4, 7, 9)$.


Exercise 8.8 Verify that the nonzero rows of a reduced echelon matrix are linearly 
independent and thus form a basis of their linear span. 


The reduction process splits into a sequence of elementary steps. In each step we take a collection of vectors $v_1, v_2, \ldots, v_\ell$ and replace some pair of vectors $v_i$, $v_j$ by their linear combinations $\tilde v_i = a v_i + b v_j$, $\tilde v_j = c v_i + d v_j$ such that $\operatorname{span}(\tilde v_i, \tilde v_j) = \operatorname{span}(v_i, v_j)$. The latter property certainly holds for the following three elementary row operations:

(1) $\tilde v_i = v_i + \lambda v_j$, $\tilde v_j = v_j$ (with any $\lambda \in \Bbbk$),
(2) $\tilde v_i = v_j$, $\tilde v_j = v_i$,
(3) $\tilde v_i = \varrho v_i$, $\tilde v_j = v_j$ (with nonzero $\varrho \in \Bbbk$),

because the original $v_i$, $v_j$ are linearly expressed through $\tilde v_i$, $\tilde v_j$ as

(1)$^{-1}$ $v_i = \tilde v_i - \lambda \tilde v_j$, $v_j = \tilde v_j$,
(2)$^{-1}$ $v_i = \tilde v_j$, $v_j = \tilde v_i$,
(3)$^{-1}$ $v_i = \varrho^{-1} \tilde v_i$, $v_j = \tilde v_j$.

The effect of these operations on the whole coordinate matrix $(v_{ij})$ of the vectors $v_i$ consists in

(1) replacement of the $i$th row by its sum with the $j$th row multiplied by some $\lambda \in \Bbbk$,
(2) interchanging the $i$th and $j$th rows,
(3) multiplication of the $i$th row by some nonzero $\varrho \in \Bbbk$.
(8.15)


Lemma 8.2 Each matrix $A \in \operatorname{Mat}_{m\times n}(\Bbbk)$ can be transformed to some reduced echelon matrix by means of a finite sequence of elementary row operations.


Proof We split the reduction procedure into $n$ steps, where $n$ is the number of columns. Assume inductively that after the $(k-1)$th step, the submatrix formed by the left $k-1$ columns is in reduced echelon form¹⁵ and has $s$ nonzero rows. Note that $0 \le s \le k-1$ and the $(m-s) \times (k-1)$ submatrix situated in the left-hand bottom corner is filled with zeros. At the $k$th step, we choose some nonzero element $a$ situated in the $k$th column strictly below the $s$th row. If there is no such element, we can pass to the $(k+1)$th step. If such an element $a$ exists and is in the $t$th row, we multiply this row by $a^{-1}$. Then, if $t \ne s+1$, we interchange the $(s+1)$th and $t$th rows. Thus, we get 1 at position $(s+1, k)$ and preserve the reduced echelon form of the submatrix formed by the leftmost $k-1$ columns. Finally, we annihilate all nonzero elements of the $k$th column except for the unit in the $(s+1)$th row by adding appropriate multiples of the $(s+1)$th row to all rows containing those nonzero elements. After that, we can proceed to the $(k+1)$th step. □
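
The column-by-column procedure from the proof can be written out as a short program. This is a sketch under stated assumptions rather than a definitive implementation: the ground field is taken to be $\mathbb{Q}$ (modelled by `fractions.Fraction`), the function name `rref` is hypothetical, and rows are indexed from 0 internally while the returned shape uses the 1-based column numbers of the text.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced echelon matrix, shape) for a matrix over Q given by its rows."""
    a = [[Fraction(x) for x in row] for row in rows]
    m = len(a)
    n = len(a[0]) if m else 0
    s = 0        # number of nonzero rows of the echelon part built so far
    shape = []   # 1-based numbers j_1 < j_2 < ... of the pivot columns
    for k in range(n):                       # the step that treats column k
        # look for a nonzero element in column k strictly below the first s rows
        t = next((i for i in range(s, m) if a[i][k] != 0), None)
        if t is None:
            continue                         # no such element: pass to the next column
        pivot = a[t][k]
        a[t] = [x / pivot for x in a[t]]     # operation (3): scale the row by pivot^{-1}
        a[s], a[t] = a[t], a[s]              # operation (2): move it right below the s rows
        for i in range(m):                   # operation (1): clear the rest of column k
            if i != s and a[i][k] != 0:
                c = a[i][k]
                a[i] = [x - c * y for x, y in zip(a[i], a[s])]
        shape.append(k + 1)
        s += 1
    return a, shape

R, J = rref([[0, 2, 4], [1, 1, 1]])
print(R, J)
```

The number of nonzero rows of the result, `len(J)`, is the rank of the input matrix.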


15 For $k = 1$ this means nothing.


Example 8.8 (How It Works) Let us transform the matrix

$$A = \begin{pmatrix}
2 & -4 & -8 & 2 & -4\\
-1 & 1 & 3 & 0 & 1\\
-1 & -1 & 1 & 2 & -1\\
-1 & 0 & 2 & 1 & 1
\end{pmatrix} \in \operatorname{Mat}_{4\times 5}(\mathbb{Q}) \tag{8.16}$$

to reduced echelon form. Multiply the bottom row by $-1$, and then swap it with the first row:

$$\begin{pmatrix}
1 & 0 & -2 & -1 & -1\\
-1 & 1 & 3 & 0 & 1\\
-1 & -1 & 1 & 2 & -1\\
2 & -4 & -8 & 2 & -4
\end{pmatrix};$$

then eliminate the first column below the first row by adding to the second, third, and fourth rows the first row multiplied by $1$, $1$, and $-2$ respectively:

$$\begin{pmatrix}
1 & 0 & -2 & -1 & -1\\
0 & 1 & 1 & -1 & 0\\
0 & -1 & -1 & 1 & -2\\
0 & -4 & -4 & 4 & -2
\end{pmatrix};$$

then eliminate the second column below the second row by adding the second row multiplied by $1$ and by $4$ to the third and fourth rows:

$$\begin{pmatrix}
1 & 0 & -2 & -1 & -1\\
0 & 1 & 1 & -1 & 0\\
0 & 0 & 0 & 0 & -2\\
0 & 0 & 0 & 0 & -2
\end{pmatrix};$$

then divide the third row by $-2$ and eliminate the last column outside the third row by adding appropriate multiples of the third row to the first and fourth rows:

$$\begin{pmatrix}
1 & 0 & -2 & -1 & 0\\
0 & 1 & 1 & -1 & 0\\
0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 0
\end{pmatrix}. \tag{8.17}$$

We have obtained a reduced echelon matrix of shape $(1, 2, 5)$. The result shows that $\operatorname{rk} A = 3$.


Example 8.9 (Basis in the Linear Span of Given Vectors) Since the elementary row operations do not change the linear span of the rows, the nonzero rows of the resulting reduced echelon matrix form a basis in the linear span of the rows of the original matrix.¹⁶ For example, the computation from Example 8.8 shows that the subspace $U \subset \mathbb{Q}^5$ spanned by the rows of the matrix

$$\begin{pmatrix}
2 & -4 & -8 & 2 & -4\\
-1 & 1 & 3 & 0 & 1\\
-1 & -1 & 1 & 2 & -1\\
-1 & 0 & 2 & 1 & 1
\end{pmatrix} \tag{8.18}$$

has dimension 3, and the rows of the matrix

$$\begin{pmatrix}
1 & 0 & -2 & -1 & 0\\
0 & 1 & 1 & -1 & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix} \tag{8.19}$$

form a basis in $U$.


16 See Exercise 8.8 on p. 183.


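Proposition 8.1 For every $r$-dimensional subspace $U \subset \Bbbk^n$, the standard basis of $\Bbbk^n$ can be decomposed into two disjoint sets $\{e_{i_1}, e_{i_2}, \ldots, e_{i_{n-r}}\} \sqcup \{e_{j_1}, e_{j_2}, \ldots, e_{j_r}\} = \{e_1, e_2, \ldots, e_n\}$ such that the complementary coordinate subspaces

$$E_I = \operatorname{span}(e_{i_1}, e_{i_2}, \ldots, e_{i_{n-r}}) \simeq \Bbbk^{n-r} \quad\text{and}\quad E_J = \operatorname{span}(e_{j_1}, e_{j_2}, \ldots, e_{j_r}) \simeq \Bbbk^{r}$$

satisfy the following mutually equivalent conditions:

(1) $U \cap E_I = 0$,
(2) the quotient map $\pi : \Bbbk^n \twoheadrightarrow \Bbbk^n/U$ is restricted to the isomorphism $\pi|_{E_I} : E_I \xrightarrow{\;\sim\;} \Bbbk^n/U$,
(3) the projection $p : \Bbbk^n \twoheadrightarrow E_J$, $(x_1, x_2, \ldots, x_n) \mapsto (x_{j_1}, x_{j_2}, \ldots, x_{j_r})$, of $\Bbbk^n$ onto $E_J$ along $E_I$ is restricted to the isomorphism $p|_U : U \xrightarrow{\;\sim\;} E_J$,
(4) there are $r$ vectors $u_1, u_2, \ldots, u_r \in U$ of the form $u_\nu = e_{j_\nu} + w_\nu$, where $w_\nu \in E_I$.

For every such decomposition $\Bbbk^n = E_I \oplus E_J$, the vectors $u_\nu$ in (4) form a basis of $U$ and are uniquely determined by $U$ and the decomposition.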


Proof For every basis $w_1, w_2, \ldots, w_r$ of $U$, the exchange lemma¹⁷ allows us to replace some $r$ vectors $e_{j_1}, e_{j_2}, \ldots, e_{j_r}$ of the standard basis in $\Bbbk^n$ by the vectors $w_\nu$ in such a way that

$$w_1, w_2, \ldots, w_r,\; e_{i_1}, e_{i_2}, \ldots, e_{i_{n-r}}$$

is a basis in $\Bbbk^n$. This forces $E_I = \operatorname{span}(e_{i_1}, e_{i_2}, \ldots, e_{i_{n-r}})$ to satisfy condition (1). Now let us show that conditions (1)–(4) are equivalent. Since $\ker p = E_I$ and $\ker \pi = U$, condition (1) implies that $U \cap \ker p = E_I \cap \ker \pi = 0$. Hence both restricted maps $p|_U : U \to E_J$ and $\pi|_{E_I} : E_I \to \Bbbk^n/U$ are injective. Since the source and target spaces in both cases have equal dimensions, both restricted maps are isomorphisms. Thus, (1) implies (2) and (3). Conversely, condition (2) (respectively (3)) implies the transversality of $E_I$ (respectively of $U$) with $\ker \pi$ (respectively with $\ker p$). Such transversality is equivalent to (1). Condition (4) says that $p(u_\nu) = e_{j_\nu}$. If it holds, then $p|_U$ is an isomorphism. Conversely, if $p|_U$ is an isomorphism, there exists a unique basis $u_1, u_2, \ldots, u_r \in U$ such that $p(u_\nu) = e_{j_\nu}$ for each $\nu$. □


17 See Lemma 6.2 on p. 132.


Remark 8.2 There are $r$-dimensional subspaces $U \subset \Bbbk^n$ transversal to all $(n-r)$-dimensional coordinate subspaces $E_I$. Moreover, over an infinite ground field, a "typical" subspace is exactly of this sort. Thus for generic $U$, conditions (1)–(4) hold for many decompositions $\Bbbk^n = E_I \oplus E_J$, often for all. Gauss's method indicates one of these decompositions, the one with lexicographically minimal¹⁸ $J$, and explicitly computes the corresponding vectors $u_1, u_2, \ldots, u_r$ in (4). The latter feature is the main purpose of Gauss's method.


Exercise 8.9 Let $A \in \operatorname{Mat}_{m\times n}(\Bbbk)$ be an arbitrary matrix, $U \subset \Bbbk^n$ the linear span of its rows, and $J = (j_1, j_2, \ldots, j_r)$ any shape satisfying $1 \le j_1 < j_2 < \cdots < j_r \le n$. Show that $U$ is isomorphically projected onto the coordinate subspace $E_J$ along the complementary coordinate subspace $E_I$ if and only if the $m \times r$ submatrix of $A$ formed by the columns $j_1, j_2, \ldots, j_r$ has rank $r = \dim U$.


18 See Sect. 8.4.2 on p. 190 below.


Example 8.10 (Basis of a Quotient Space) Let $R$ be a reduced echelon matrix of shape¹⁹ $J$. Write $U$ for the linear span of its rows. Then the rows of $R$ surely satisfy condition (4) of Proposition 8.1. Therefore, $U$ is isomorphically projected onto the coordinate subspace $E_J$ along the complementary coordinate subspace $E_I$, whereas the mod $U$ congruence classes of the standard basis vectors $e_i \in E_I$ form a basis of the quotient space $\Bbbk^n/U$.

For example, one more consequence of the computation made in (8.19) is that the classes of the vectors $e_3 = (0, 0, 1, 0, 0)$ and $e_4 = (0, 0, 0, 1, 0)$ form a basis in the quotient space $\mathbb{Q}^5/U$, where $U \subset \mathbb{Q}^5$ is the subspace spanned by the rows of the matrix

$$A = \begin{pmatrix}
2 & -4 & -8 & 2 & -4\\
-1 & 1 & 3 & 0 & 1\\
-1 & -1 & 1 & 2 & -1\\
-1 & 0 & 2 & 1 & 1
\end{pmatrix}, \tag{8.20}$$

whose reduced echelon form is

$$R = \begin{pmatrix}
1 & 0 & -2 & -1 & 0\\
0 & 1 & 1 & -1 & 0\\
0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & 0
\end{pmatrix}. \tag{8.21}$$


19 Recall that the shape of an $r \times n$ echelon matrix is the increasing sequence of numbers $J = (j_1, j_2, \ldots, j_r)$ of the columns in which the leftmost nonzero elements of the rows are found.


Example 8.11 (Solution of a Homogeneous System of Linear Equations) Associated with the matrix (8.20) is also the system of homogeneous linear equations

$$\begin{aligned}
2x_1 - 4x_2 - 8x_3 + 2x_4 - 4x_5 &= 0,\\
-x_1 + x_2 + 3x_3 + x_5 &= 0,\\
-x_1 - x_2 + x_3 + 2x_4 - x_5 &= 0,\\
-x_1 + 2x_3 + x_4 + x_5 &= 0.
\end{aligned} \tag{8.22}$$

Its solution space is the annihilator $\operatorname{Ann}(U) \subset \mathbb{Q}^{5*}$ of the linear subspace $U \subset \mathbb{Q}^5$ spanned by the rows of the matrix $A$. If the linear form $\xi \in \mathbb{Q}^{5*}$ has coordinates $(x_1, x_2, \ldots, x_5)$ in the standard basis $e_1^*, e_2^*, \ldots, e_5^*$ of $\mathbb{Q}^{5*}$ dual to the standard basis $e_1, e_2, \ldots, e_5$ in $\mathbb{Q}^5$, then the value of $\xi$ applied to the vector $a = (\alpha_1, \alpha_2, \ldots, \alpha_5) \in \mathbb{Q}^5$ is $\xi(a) = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_5 x_5$. Therefore, equations (8.22) say that $\xi$ annihilates four vectors spanning $U$, that is, it annihilates $U$. Choosing another spanning system for $U$ changes neither $U$ nor $\operatorname{Ann} U$. Thus, the system of equations with matrix (8.21),

$$\begin{aligned}
x_1 - 2x_3 - x_4 &= 0,\\
x_2 + x_3 - x_4 &= 0,\\
x_5 &= 0,
\end{aligned} \tag{8.23}$$

is equivalent to the initial system (8.22). The system (8.23) has separable unknowns, that is, it can be rewritten in the form

$$\begin{aligned}
x_1 &= 2x_3 + x_4,\\
x_2 &= -x_3 + x_4,\\
x_5 &= 0,
\end{aligned} \tag{8.24}$$

where the unknowns $x_1$, $x_2$, $x_5$ are expressed through the complementary unknowns $x_3$, $x_4$. The left-hand-side unknowns, whose numbers form the shape of the reduced echelon matrix (8.21), are called dependent. The complementary unknowns are called free. They can take any values within $\mathbb{Q}^2 = E^*_{(3,4)}$, and for each $(x_3, x_4) \in \mathbb{Q}^2$, there exists a unique triple $(x_1, x_2, x_5) \in \mathbb{Q}^3 = E^*_{(1,2,5)}$ such that the total collection $(x_1, x_2, \ldots, x_5) \in \mathbb{Q}^{5*}$ solves the system. Algebraically, $x_1$, $x_2$, $x_5$ are computed from given $x_3$, $x_4$ by formulas (8.24). Geometrically, the projection along the coordinate space $E^*_{(1,2,5)} = \operatorname{span}(e_1^*, e_2^*, e_5^*) \subset \mathbb{Q}^{5*}$ onto the complementary coordinate plane $E^*_{(3,4)} = \operatorname{span}(e_3^*, e_4^*)$ gives an isomorphism $\operatorname{Ann} U \xrightarrow{\;\sim\;} E^*_{(3,4)}$. Therefore, there exists a unique basis in $\operatorname{Ann}(U)$ projected to the standard basis $e_3^*, e_4^* \in E^*_{(3,4)}$. It should consist of two forms looking like $\varphi_1 = (*, *, 1, 0, *)$, $\varphi_2 = (*, *, 0, 1, *)$. The coordinates marked by asterisks are easily determined from (8.24), which gives $\varphi_1 = (2, -1, 1, 0, 0)$, $\varphi_2 = (1, 1, 0, 1, 0)$. Thus, the solution space of the system (8.22) consists of all

$$(x_1, x_2, \ldots, x_5) = \lambda_1 \varphi_1 + \lambda_2 \varphi_2 = (2\lambda_1 + \lambda_2,\; -\lambda_1 + \lambda_2,\; \lambda_1,\; \lambda_2,\; 0),$$

where $(\lambda_1, \lambda_2)$ runs through $\mathbb{Q}^2$. Every system with reduced echelon matrix of shape $J$ can be solved in this way with the dependent unknowns numbered by $J$ and the free unknowns numbered by the indices $I$ complementary to $J$.
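
The recipe of this example, one basis solution per free unknown, can be sketched as follows. The function name `kernel_basis` and its calling convention (nonzero rows of a reduced echelon matrix together with its 1-based shape) are assumptions made for illustration only.

```python
from fractions import Fraction

def kernel_basis(R, shape, n):
    """Basis of the solution space of Rx = 0.

    R     -- the nonzero rows of a reduced echelon matrix with n columns,
    shape -- the 1-based numbers of its pivot columns (the dependent unknowns).
    For each free unknown x_f the returned solution has x_f = 1, all other free
    unknowns equal to 0, and dependent unknowns read off from the rows of R.
    """
    free = [j for j in range(1, n + 1) if j not in shape]
    basis = []
    for f in free:
        x = [Fraction(0)] * n
        x[f - 1] = Fraction(1)
        for row, j in zip(R, shape):
            x[j - 1] = -Fraction(row[f - 1])   # x_j = -(coefficient of x_f in that row)
        basis.append(x)
    return basis

# The nonzero rows of the matrix (8.21) and its shape (1, 2, 5).
R = [[1, 0, -2, -1, 0],
     [0, 1, 1, -1, 0],
     [0, 0, 0, 0, 1]]
phi = kernel_basis(R, (1, 2, 5), 5)
print(phi)
```

For the matrix (8.21) the two vectors produced are the forms $\varphi_1$, $\varphi_2$ found above.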


Example 8.12 (Solution of an Arbitrary System of Linear Equations) Given an arbitrary system of linear equations

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3,\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m,
\end{aligned} \tag{8.25}$$

write $A = (a_{ij})$ for the $m \times n$ matrix of the coefficients on the left-hand side and $(A \mid b)$ for the $m \times (n+1)$ matrix constructed from $A$ by attaching the right-hand-side column $b$ to the right-hand side of $A$. The matrix $(A \mid b)$ is called the augmented matrix of the system (8.25). In the language of equations, the three elementary row operations (8.15) applied to $(A \mid b)$ are:

(1) add a multiple of the $j$th equation to the $i$th equation,
(2) swap the $i$th and $j$th equations,
(3) multiply both sides of the $i$th equation by an invertible constant.
(8.26)


Exercise 8.10 Verify that these transformations take a system to an equivalent system, namely one that has the same solution set.


Thus, Gauss's method reduces a system of the form (8.25) to an equivalent system with reduced echelon augmented matrix $(R \mid \beta) \in \operatorname{Mat}_{r\times(n+1)}(\Bbbk)$, where $r = \operatorname{rk}(R \mid \beta) = \operatorname{rk}(A \mid b)$ and $R = (\alpha_{ij}) \in \operatorname{Mat}_{r\times n}(\Bbbk)$. Such a reduced echelon system has separable unknowns and can be solved exactly in the same way as in the previous example. Namely, let $J = (j_1, j_2, \ldots, j_r)$ be the shape of $(R \mid \beta)$. If $j_r = n+1$, then the $r$th equation is $0 = 1$, and the system is inconsistent. Note that in this case,

$$\operatorname{rk} A = \operatorname{rk} R = r - 1 \ne r = \operatorname{rk}(R \mid \beta) = \operatorname{rk}(A \mid b),$$

and this agrees with the Capelli–Fontené–Frobenius–Kronecker–Rouché theorem, Theorem 7.4 on p. 166.

If $j_r \le n$, then $\operatorname{rk} R = \operatorname{rk}(R \mid \beta) = r$, and the dependent unknowns $x_j$ with $j \in J$ are expressed through the complementary unknowns $x_i$ with $i \in I = \{1, 2, \ldots, n\} \smallsetminus J$ as

$$\begin{aligned}
x_{j_1} &= \beta_1 - \alpha_{1 i_1} x_{i_1} - \alpha_{1 i_2} x_{i_2} - \cdots - \alpha_{1 i_{n-r}} x_{i_{n-r}},\\
x_{j_2} &= \beta_2 - \alpha_{2 i_1} x_{i_1} - \alpha_{2 i_2} x_{i_2} - \cdots - \alpha_{2 i_{n-r}} x_{i_{n-r}},\\
&\;\;\vdots\\
x_{j_r} &= \beta_r - \alpha_{r i_1} x_{i_1} - \alpha_{r i_2} x_{i_2} - \cdots - \alpha_{r i_{n-r}} x_{i_{n-r}}.
\end{aligned} \tag{8.27}$$

This gives a parametric representation of all solutions: take any $(x_{i_1}, x_{i_2}, \ldots, x_{i_{n-r}}) \in \Bbbk^{n-r} = E_I$ and compute the remaining $x_j$ by the formulas (8.27). For example, taking all free unknowns to be zero, we get the solution $x_{j_\nu} = \beta_\nu$, $x_{i_\mu} = 0$.
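
The consistency test and the particular solution with all free unknowns equal to zero can be sketched in a few lines. The names `rref` and `solve` are hypothetical, the ground field is assumed to be $\mathbb{Q}$, and the three small systems in the usage lines are made up for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduced echelon form over Q and its shape (1-based pivot columns)."""
    a = [[Fraction(x) for x in row] for row in rows]
    m, n = len(a), len(a[0])
    s, shape = 0, []
    for k in range(n):
        t = next((i for i in range(s, m) if a[i][k] != 0), None)
        if t is None:
            continue
        pivot = a[t][k]
        a[t] = [x / pivot for x in a[t]]
        a[s], a[t] = a[t], a[s]
        for i in range(m):
            if i != s and a[i][k] != 0:
                c = a[i][k]
                a[i] = [x - c * y for x, y in zip(a[i], a[s])]
        shape.append(k + 1)
        s += 1
    return a, shape

def solve(A, b):
    """Return one solution of Ax = b (free unknowns set to 0), or None if inconsistent."""
    n = len(A[0])
    augmented = [list(row) + [rhs] for row, rhs in zip(A, b)]
    R, shape = rref(augmented)
    if shape and shape[-1] == n + 1:
        return None                 # the last nonzero equation reads 0 = 1
    x = [Fraction(0)] * n
    for row, j in zip(R, shape):    # dependent unknowns: x_j = beta
        x[j - 1] = row[n]
    return x

print(solve([[1, 2, 1], [0, 1, -1]], [4, 1]))   # one free unknown, set to zero
print(solve([[1, 1], [1, 1]], [1, 2]))          # inconsistent system
print(solve([[1, 1], [1, -1]], [3, 1]))         # unique solution
```

Returning `None` for an inconsistent system is a design choice of this sketch; the general solution is obtained by adding to the returned vector an arbitrary solution of the homogeneous system.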


Example 8.13 (Graphs of Linear Maps) Let a linear subspace $U \subset \Bbbk^n$ and the decomposition $\Bbbk^n = E_I \oplus E_J$ satisfy the conditions of Proposition 8.1. Then $U$ can be viewed as the graph of the linear map $f_U = p_I \circ p_J^{-1} : E_J \to E_I$, where $p_I : U \to E_I$ and $p_J : U \xrightarrow{\;\sim\;} E_J$ are the projections. For every $v \in E_J$, the vector $f_U(v) \in E_I$ is uniquely determined by the prescription $v + f_U(v) \in U$, because the projection $p_J : U \xrightarrow{\;\sim\;} E_J$ is bijective. Let the rows $u_1, u_2, \ldots, u_r$ of the reduced echelon matrix $R$ form a basis of $U$. Then $p_J^{-1}(e_{j_\nu}) = u_\nu$, and therefore $f_U(e_{j_\nu})$ is the $\nu$th row of the submatrix $R_I \subset R$ formed by the columns $i_1, i_2, \ldots, i_{n-r}$. In other words, the matrix of the linear map $f_U$ in the standard bases is $R_I^t$, the transpose of the submatrix $R_I \subset R$ formed by the columns indexed by $I$.


8.4.2 Location of a Subspace with Respect to a Basis 


In this section we show that the reduced echelon matrix $A_{\mathrm{red}}$ obtained from a given matrix $A$ by means of Gaussian elimination is determined by the subspace $U \subset \Bbbk^n$ spanned by the rows of $A$ and does not depend on the particular sequence of elementary row operations used in the reduction procedure. Let us write $V^i = \operatorname{span}(e_{i+1}, e_{i+2}, \ldots, e_n)$ for the linear span of the last $n - i$ standard basis vectors in $\Bbbk^n$. The coordinate subspaces $V^i$ form a decreasing chain

$$\Bbbk^n = V^0 \supset V^1 \supset V^2 \supset \cdots \supset V^{n-1} \supset V^n = 0, \tag{8.28}$$

called a complete coordinate flag. Let $\pi_i : V^0 \twoheadrightarrow V^0/V^i$ denote the quotient map. Roughly speaking, $\pi_i(v)$ is "$v$ considered up to the last $(n - i)$ coordinates." Associated with every $r$-dimensional vector subspace $U \subset V^0$ is a collection of nonnegative integers

$$d_i \stackrel{\text{def}}{=} \dim \pi_i(U) = r - \dim \ker \pi_i|_U = r - \dim (U \cap V^i), \quad 0 \le i \le n.$$

The numbers $d_0, d_1, \ldots, d_n$ form a nondecreasing sequence beginning with $d_0 = 0$ and ending with $d_n = r$. All the increments $d_i - d_{i-1}$ are less than or equal to 1, because $U \cap V^i$ is the kernel of the coordinate form $x_i$ restricted to $U \cap V^{i-1}$, so that $\dim(U \cap V^{i-1})$ is at most 1 greater than $\dim(U \cap V^i)$. Write $J = (j_1, j_2, \ldots, j_r)$ for the collection of the $r$ indices for which $d_{j_\nu} - d_{j_\nu - 1} = 1$ and call it the combinatorial type of the subspace $U$ with respect to the coordinate flag (8.28). For example, the subspace $U \subset \mathbb{Q}^5$ spanned by the rows of the echelon matrix

$$\begin{pmatrix}
1 & 0 & 2 & 3 & 0\\
0 & 1 & 4 & 5 & 0\\
0 & 0 & 0 & 0 & 1
\end{pmatrix} \tag{8.29}$$

has dimensions $(d_0, d_1, \ldots, d_5) = (0, 1, 2, 2, 2, 3)$ and has combinatorial type $J = (1, 2, 5)$.
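
The numbers $d_i$ can be computed from any matrix whose rows span $U$, not only from an echelon one. The sketch below relies on the observation that $\pi_i$ forgets the last $n - i$ coordinates, so $d_i$ is the rank of the submatrix formed by the first $i$ columns. The function names `rank` and `combinatorial_type` are illustrative assumptions.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix over Q, computed by forward elimination."""
    a = [[Fraction(x) for x in row] for row in rows]
    ncols = len(a[0]) if a else 0
    rk = 0
    for k in range(ncols):
        t = next((i for i in range(rk, len(a)) if a[i][k] != 0), None)
        if t is None:
            continue
        a[rk], a[t] = a[t], a[rk]
        for i in range(rk + 1, len(a)):
            c = a[i][k] / a[rk][k]
            a[i] = [x - c * y for x, y in zip(a[i], a[rk])]
        rk += 1
    return rk

def combinatorial_type(rows):
    """Return (d_0, ..., d_n) and the type J for the span of the given rows."""
    n = len(rows[0])
    # d_i = dim pi_i(U) = rank of the first i columns
    d = [rank([row[:i] for row in rows]) for i in range(n + 1)]
    J = [i for i in range(1, n + 1) if d[i] - d[i - 1] == 1]
    return d, J

# The echelon matrix (8.29).
d, J = combinatorial_type([[1, 0, 2, 3, 0],
                           [0, 1, 4, 5, 0],
                           [0, 0, 0, 0, 1]])
print(d, J)
```

Because only ranks of column blocks are used, the result is unchanged by elementary row operations, which matches the claim that the type depends on $U$ alone.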


Exercise 8.11 Let $U \subset \Bbbk^n$ be spanned by the rows of the reduced echelon matrix $R \in \operatorname{Mat}_{r\times n}(\Bbbk)$. Show that the combinatorial type of $U$ coincides with the shape of $R$.


Therefore, the shape of $A_{\mathrm{red}}$ depends only on the linear span $U$ of the rows of $A$. A subspace $U$ of combinatorial type $J$ satisfies Proposition 8.1 on p. 186 for the decomposition $\Bbbk^n = E_I \oplus E_J$, where $I = \{1, 2, \ldots, n\} \smallsetminus J$. Hence, by Proposition 8.1, the rows of $A_{\mathrm{red}}$, which form the basis $u_1, u_2, \ldots, u_r \in U$ projected along $E_I$ to the standard basis $e_{j_1}, e_{j_2}, \ldots, e_{j_r}$ in $E_J$, are uniquely determined by $U$ and the decomposition $\Bbbk^n = E_I \oplus E_J$. We conclude that the reduced echelon matrix $A_{\mathrm{red}}$ depends only on $U \subset \Bbbk^n$. We summarize the discussion as follows.


Proposition 8.2 Every subspace $U \subset \Bbbk^n$ admits a unique basis $u_1, u_2, \ldots, u_r$ such that the coordinates of the basis vectors written in rows form a reduced echelon matrix $M_U$. The assignment $U \mapsto M_U$ establishes a bijection between the $r$-dimensional subspaces $U \subset \Bbbk^n$ and the reduced echelon matrices of width $n$ with exactly $r$ nonzero rows. □


Exercise 8.12 Show that the reduced echelon matrices of shape $(j_1, j_2, \ldots, j_r)$ form an affine subspace of dimension $r(n - r) - \sum_{\nu=1}^{r} (j_\nu - \nu)$ in $\operatorname{Mat}_{r\times n}(\Bbbk)$.


8.4.3 Gaussian Method for Inverting Matrices


We know from the examples appearing after formula (8.3) on p. 177 that each elementary row operation (8.15) on $m \times n$ matrices is realized as left multiplication by an appropriate $m \times m$ matrix $L$, which depends only on the operation but not on the matrix to which the operation is applied. In other words, each row operation is a map of type

$$\operatorname{Mat}_{m\times n}(\Bbbk) \to \operatorname{Mat}_{m\times n}(\Bbbk), \quad A \mapsto LA.$$


Exercise 8.13 Verify that the matrix $L$ realizing a row operation is equal to the result of this operation applied to the $m \times m$ identity matrix $E$.


As we have seen before (8.15), for every elementary row operation, there is an inverse row operation that recovers the original matrix from the transformed one. Write $L'$ for the $m \times m$ matrix whose left multiplication realizes the row operation inverse to that provided by the left multiplication by a matrix $L$. Thus, $L'LA = A$ for every $A \in \operatorname{Mat}_{m\times n}(\Bbbk)$. Similarly, there exists a matrix $L''$ such that $L''L'A = A$ for all $A$. Taking $A = E$, we get $L''L' = L''L'E = E = L'LE = L'L$. Hence, $L'' = L''(L'L) = (L''L')L = L$, that is, $L = (L')^{-1}$. We conclude that each elementary row operation is realized as left multiplication by an invertible $m \times m$ matrix.
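
The statements of this paragraph and of Exercise 8.13 can be tested on a small example. This is a sketch with hypothetical helper names (`identity`, `matmul`, `add_multiple`, `swap`, `scale`) and an arbitrary $2 \times 3$ test matrix; rows are indexed from 0.

```python
from fractions import Fraction

def identity(m):
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# The three elementary row operations (8.15), each returning a new matrix.
def add_multiple(a, i, j, lam):
    """Replace row i by row i + lam * row j."""
    return [[x + lam * y for x, y in zip(a[i], a[j])] if r == i else list(a[r])
            for r in range(len(a))]

def swap(a, i, j):
    b = [list(row) for row in a]
    b[i], b[j] = b[j], b[i]
    return b

def scale(a, i, rho):
    return [[rho * x for x in a[r]] if r == i else list(a[r]) for r in range(len(a))]

A = [[1, 2, 3], [4, 5, 6]]   # an arbitrary 2 x 3 test matrix

# Pairs (operation, inverse operation).
checks = [
    (lambda m: add_multiple(m, 0, 1, 5), lambda m: add_multiple(m, 0, 1, -5)),
    (lambda m: swap(m, 0, 1),            lambda m: swap(m, 0, 1)),
    (lambda m: scale(m, 1, Fraction(3)), lambda m: scale(m, 1, Fraction(1, 3))),
]
for op, inverse_op in checks:
    L = op(identity(2))                # Exercise 8.13: apply the operation to E
    L_prime = inverse_op(identity(2))  # the matrix of the inverse operation
    assert matmul(L, A) == op(A)       # the operation is the map A -> LA
    assert matmul(L_prime, L) == identity(2) == matmul(L, L_prime)
```

If no assertion fails, each of the three sample operations is realized by an invertible matrix, as claimed.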


Exercise 8.14 Verify that the product of invertible matrices is invertible and

$$(L_1 L_2 \cdots L_k)^{-1} = L_k^{-1} \cdots L_2^{-1} L_1^{-1}.$$


Let $A$ be a square $n \times n$ matrix. As we have just seen, every matrix $B$ obtained from $A$ by elementary row operations can be written as $B = LA$ for an invertible matrix $L$ constructed from the identity matrix $E$ by the same sequence of row operations that creates $B$ from $A$. If $B$ has a reduced echelon form, then either $B = E$ or the bottom row in $B$ vanishes. In the second case, $\operatorname{rk} B = \operatorname{rk} A < n$, and therefore both matrices $B$, $A$ are noninvertible, because the linear endomorphisms $\Bbbk^n \to \Bbbk^n$ given by these matrices in the standard basis of $\Bbbk^n$ are not surjective. In the first case, $LA = E$ for some invertible matrix $L$. Therefore $A = L^{-1}$ is invertible and $A^{-1} = L$ is constructed from $E$ by the same sequence of elementary row operations that constructs $E$ from $A$.

In other words, to check whether a given matrix $A \in \operatorname{Mat}_n(\Bbbk)$ is invertible, we can proceed as follows. Form the $n \times 2n$ matrix $(A \mid E)$ by attaching the identity matrix $E$ to the right of $A$. Use Gaussian elimination to transform $(A \mid E)$ to the reduced echelon matrix $(B \mid C)$. If during this transformation the left $n \times n$ half becomes noninvertible,²⁰ then $A$ is not invertible. If the resulting echelon matrix has $B = E$, then $C = A^{-1}$.


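20 For example, if one row becomes proportional to another.


Example 8.14 Let us analyze whether the matrix

$$A = \begin{pmatrix}
6 & 3 & -2 & 1\\
1 & 4 & 1 & 1\\
1 & 1 & 3 & -1\\
-1 & 0 & -2 & 1
\end{pmatrix}$$

is invertible, and if it is, compute the inverse by Gaussian row elimination in the extended matrix

$$\left(\begin{array}{rrrr|rrrr}
6 & 3 & -2 & 1 & 1 & 0 & 0 & 0\\
1 & 4 & 1 & 1 & 0 & 1 & 0 & 0\\
1 & 1 & 3 & -1 & 0 & 0 & 1 & 0\\
-1 & 0 & -2 & 1 & 0 & 0 & 0 & 1
\end{array}\right).$$

Change the sign of the bottom row and swap the top row with the bottom:

$$\left(\begin{array}{rrrr|rrrr}
1 & 0 & 2 & -1 & 0 & 0 & 0 & -1\\
1 & 4 & 1 & 1 & 0 & 1 & 0 & 0\\
1 & 1 & 3 & -1 & 0 & 0 & 1 & 0\\
6 & 3 & -2 & 1 & 1 & 0 & 0 & 0
\end{array}\right).$$

Annihilate the first column below the first row by adding appropriate multiples of the first row to all other rows:

$$\left(\begin{array}{rrrr|rrrr}
1 & 0 & 2 & -1 & 0 & 0 & 0 & -1\\
0 & 4 & -1 & 2 & 0 & 1 & 0 & 1\\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
0 & 3 & -14 & 7 & 1 & 0 & 0 & 6
\end{array}\right).$$

Swap the two middle rows and annihilate the second column below the second row:

$$\left(\begin{array}{rrrr|rrrr}
1 & 0 & 2 & -1 & 0 & 0 & 0 & -1\\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
0 & 0 & -5 & 2 & 0 & 1 & -4 & -3\\
0 & 0 & -17 & 7 & 1 & 0 & -3 & 3
\end{array}\right). \tag{8.30}$$

Now, to avoid fractions, let us digress from the classical procedure and transform the two bottom rows by means of the invertible matrix²¹

$$\begin{pmatrix} -5 & 2\\ -17 & 7 \end{pmatrix}^{-1} = \begin{pmatrix} -7 & 2\\ -17 & 5 \end{pmatrix},$$

that is, multiply the whole $4 \times 8$ matrix from the left by the matrix

$$\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & -7 & 2\\
0 & 0 & -17 & 5
\end{pmatrix}.$$


Exercise 8.15 Verify that this matrix is invertible.


We get

$$\left(\begin{array}{rrrr|rrrr}
1 & 0 & 2 & -1 & 0 & 0 & 0 & -1\\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 1\\
0 & 0 & 1 & 0 & 2 & -7 & 22 & 27\\
0 & 0 & 0 & 1 & 5 & -17 & 53 & 66
\end{array}\right).$$

Finally, subtract the third row from the second row, and add to the first row the fourth row minus twice the third row:

$$\left(\begin{array}{rrrr|rrrr}
1 & 0 & 0 & 0 & 1 & -3 & 9 & 11\\
0 & 1 & 0 & 0 & -2 & 7 & -21 & -26\\
0 & 0 & 1 & 0 & 2 & -7 & 22 & 27\\
0 & 0 & 0 & 1 & 5 & -17 & 53 & 66
\end{array}\right).$$

We conclude that the matrix $A$ is invertible and

$$A^{-1} = \begin{pmatrix}
1 & -3 & 9 & 11\\
-2 & 7 & -21 & -26\\
2 & -7 & 22 & 27\\
5 & -17 & 53 & 66
\end{pmatrix}.$$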

21 See formula (8.7) on p. 179.


Example 8.15 (Solution of a System of Linear Equations Revisited) Using matrix notation, we can write a system of $n$ linear equations in $n$ unknowns,

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3,\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n,
\end{aligned}$$

as a single linear equation $Ax = b$ in the unknown $n \times 1$ matrix $x$, where $A = (a_{ij})$ and $b = (b_i)$ are given $n \times n$ and $n \times 1$ matrices. If $A$ is invertible, this equation has a unique solution $x = A^{-1}b$. It can be found via Gaussian elimination: if we transform the augmented matrix $(A \mid b)$ to the reduced echelon form $(E \mid c)$, then we get $c = A^{-1}b$. If we need to solve several systems of equations having the same coefficient matrix $A$ and different right-hand sides $b$, it may be more productive to compute $A^{-1}$ separately and then calculate the solutions as $x = A^{-1}b$.


8.5 Matrices over Noncommutative Rings 


If we remove the commutativity of multiplication²² from the definition of a commutative ring,²³ then we get what is called just a ring, i.e., an additive abelian group $R$ equipped with associative and two-sided distributive multiplication $R \times R \to R$. Associativity and distributivity mean that $a(bc) = (ab)c$, $a(b + c) = ab + ac$, and $(a + b)c = ac + bc$ for all $a, b, c \in R$. If there is some element $e \in R$ such that $ef = fe = f$ for all $f \in R$, then $e$ is called a unit, and we say that $R$ is a ring with unit.


Exercise 8.16 For every ring R, show that (a) 0- f = 0 for all f € R, (b) a unit is 
unique if it exists. 


Every associative algebra is a ring. In particular, the endomorphism algebra $\operatorname{End}(V)$ of a vector space $V$ and the matrix algebra $\operatorname{Mat}_n(\Bbbk)$ are rings. The latter example can be extended by replacing the field $\Bbbk$ with any ring $R$, not necessarily commutative. Then the sum $S = F + G$ and product $P = FG$ of matrices $F = (f_{ij})$ and $G = (g_{ij})$ are defined by the same rules,

$$s_{ij} = f_{ij} + g_{ij} \quad\text{and}\quad p_{ij} = \sum_{\nu} f_{i\nu}\, g_{\nu j},$$

as for matrices over commutative rings.


22 See formula (2.5) on p. 19.
23 See Sect. 2.1.2 on p. 21.


Exercise 8.17 Check associativity and both distributivity properties of these oper- 
ations. 


The ring of square $n \times n$ matrices with elements in a ring $R$ is denoted by $\operatorname{Mat}_n(R)$ and called the order-$n$ matrix algebra over $R$. One should be careful when making computations with matrices whose elements do not commute and are not invertible.
For example, formula (8.5) on p. 179 becomes incorrect over a noncommutative 
ring R, because we permuted matrix element factors in the products when we 
took the common factor (a + d) out of the secondary diagonal of the matrix. 
Formula (8.7) for the inverse matrix also fails over a noncommutative ring. Even 
over a commutative ring that is not a field, the invertibility criterion should be 
formulated more accurately: a 2 x 2 matrix F over a commutative ring is invertible 
if and only if det F is invertible, and in this case, formula (8.7) for the inverse matrix 
holds. 


Exercise 8.18 Prove the latter statement. 


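Example 8.16 (Some Invertible 2 × 2 Matrices) Over an arbitrary ring $R$, an upper triangular matrix

$$\begin{pmatrix} a & b\\ 0 & d \end{pmatrix}$$

is invertible if and only if both diagonal elements $a$, $d$ are invertible in $R$. Indeed, the equality

$$\begin{pmatrix} a & b\\ 0 & d \end{pmatrix} \begin{pmatrix} x & y\\ z & w \end{pmatrix} = \begin{pmatrix} ax + bz & ay + bw\\ dz & dw \end{pmatrix} = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}$$

leads to $dw = 1$ and $dz = 0$. Thus, $d$ is invertible, $w = d^{-1}$, and $z = 0$. This gives $ax = 1$ and $ay + bd^{-1} = 0$ in the upper row. Hence, $x = a^{-1}$, $y = -a^{-1}bd^{-1}$, and

$$\begin{pmatrix} a & b\\ 0 & d \end{pmatrix}^{-1} = \begin{pmatrix} a^{-1} & -a^{-1}bd^{-1}\\ 0 & d^{-1} \end{pmatrix}.$$

Similar arguments show that a lower triangular matrix

$$\begin{pmatrix} a & 0\\ c & d \end{pmatrix}$$

is invertible if and only if $a$, $d$ are both invertible, and in this case,

$$\begin{pmatrix} a & 0\\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} a^{-1} & 0\\ -d^{-1}ca^{-1} & d^{-1} \end{pmatrix}.$$


Exercise 8.19 Show that the matrices $\begin{pmatrix} a & b\\ c & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & b\\ c & d \end{pmatrix}$ are invertible if and only if $c$, $b$ are both invertible, and in this case,

$$\begin{pmatrix} a & b\\ c & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & c^{-1}\\ b^{-1} & -b^{-1}ac^{-1} \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 & b\\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} -c^{-1}db^{-1} & c^{-1}\\ b^{-1} & 0 \end{pmatrix}.$$


The above calculations show that Gaussian elementary row operations of the first type, which replace a row by its sum with any multiple of another row, are realized by left multiplication by invertible matrices even over noncommutative rings.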


Example 8.17 (Unitriangular Matrices) The two diagonals beginning at the upper left and upper right corners of a square matrix are called the main and secondary diagonals respectively. A square matrix $A = (a_{ij})$ is called upper (respectively lower) triangular if all matrix elements below (respectively above) the main diagonal vanish, i.e., if $a_{ij} = 0$ for all $i > j$ (respectively for all $i < j$). Over a ring with unit, a triangular matrix is called unitriangular if all the diagonal elements equal 1.


Exercise 8.20 Check that upper and lower triangular and unitriangular matrices form subrings in $\operatorname{Mat}_n(R)$.


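We are going to show that every upper unitriangular matrix $A = (a_{ij})$ over an arbitrary ring with unit is invertible and that the elements $b_{ij}$, $i < j$, of the inverse matrix $B = A^{-1}$ are given by the formula

$$\begin{aligned}
b_{ij} &= \sum_{s=0}^{j-i-1} (-1)^{s+1} \sum_{i<\nu_1<\cdots<\nu_s<j} a_{i\nu_1} a_{\nu_1\nu_2} a_{\nu_2\nu_3} \cdots a_{\nu_{s-1}\nu_s} a_{\nu_s j}\\
&= -a_{ij} + \sum_{i<k<j} a_{ik}a_{kj} - \sum_{i<k<l<j} a_{ik}a_{kl}a_{lj} + \sum_{i<k<l<m<j} a_{ik}a_{kl}a_{lm}a_{mj} - \cdots.
\end{aligned} \tag{8.31}$$

We proceed by Gaussian elimination. For a $4 \times 4$ matrix

$$A = \begin{pmatrix}
1 & a_{12} & a_{13} & a_{14}\\
0 & 1 & a_{23} & a_{24}\\
0 & 0 & 1 & a_{34}\\
0 & 0 & 0 & 1
\end{pmatrix}$$

the computation is as follows. Attach the identity matrix from the right:

$$\left(\begin{array}{cccc|cccc}
1 & a_{12} & a_{13} & a_{14} & 1 & 0 & 0 & 0\\
0 & 1 & a_{23} & a_{24} & 0 & 1 & 0 & 0\\
0 & 0 & 1 & a_{34} & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1
\end{array}\right).$$

Annihilate the second column over the main diagonal by adding the appropriate multiple of the second row to the first:

$$\left(\begin{array}{cccc|cccc}
1 & 0 & a_{13} - a_{12}a_{23} & a_{14} - a_{12}a_{24} & 1 & -a_{12} & 0 & 0\\
0 & 1 & a_{23} & a_{24} & 0 & 1 & 0 & 0\\
0 & 0 & 1 & a_{34} & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1
\end{array}\right).$$

Annihilate the third column over the main diagonal by adding appropriate multiples of the third row to the first and the second rows:

$$\left(\begin{array}{cccc|cccc}
1 & 0 & 0 & a_{14} - a_{12}a_{24} - a_{13}a_{34} + a_{12}a_{23}a_{34} & 1 & -a_{12} & -a_{13} + a_{12}a_{23} & 0\\
0 & 1 & 0 & a_{24} - a_{23}a_{34} & 0 & 1 & -a_{23} & 0\\
0 & 0 & 1 & a_{34} & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1
\end{array}\right).$$

Finally, annihilate the last column over the main diagonal by adding appropriate multiples of the last row to all the other rows:

$$A^{-1} = \begin{pmatrix}
1 & -a_{12} & -a_{13} + a_{12}a_{23} & -a_{14} + a_{12}a_{24} + a_{13}a_{34} - a_{12}a_{23}a_{34}\\
0 & 1 & -a_{23} & -a_{24} + a_{23}a_{34}\\
0 & 0 & 1 & -a_{34}\\
0 & 0 & 0 & 1
\end{pmatrix}.$$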
To explain the general case, let us mark $n$ distinct points $1, 2, \ldots, n$ on a plane and depict a matrix element $a_{ij}$ by an arrow drawn from $i$ to $j$. We think of right multiplication by $a_{ij}$ as a passage from $i$ to $j$ along this arrow. Then every directed passage formed by $s$ sequential arrows

$$k_0 \xrightarrow{\;a_{k_0k_1}\;} k_1 \xrightarrow{\;a_{k_1k_2}\;} k_2 \xrightarrow{\;a_{k_2k_3}\;} \cdots \xrightarrow{\;a_{k_{s-2}k_{s-1}}\;} k_{s-1} \xrightarrow{\;a_{k_{s-1}k_s}\;} k_s, \quad\text{where } k_0 < k_1 < \cdots < k_s,$$

corresponds to the product $a_{k_0k_1} a_{k_1k_2} a_{k_2k_3} \cdots a_{k_{s-2}k_{s-1}} a_{k_{s-1}k_s}$ of $s$ matrix elements in $R$. Under this visualization, formula (8.31) says that the element $b_{ij}$ is equal to the sum of all oriented passages going from $i$ to $j$, where all $s$-step passages are taken with sign $(-1)^s$.

Now consider the $n \times 2n$ matrix $(A \mid E)$ and multiply it from the left by the matrix

$$S = \begin{pmatrix}
1 & b_{12} & b_{13} & \cdots & b_{1(n-1)} & 0\\
0 & 1 & b_{23} & \cdots & b_{2(n-1)} & 0\\
\vdots & \ddots & \ddots & \ddots & \vdots & \vdots\\
0 & \cdots & 0 & 1 & b_{(n-2)(n-1)} & 0\\
0 & \cdots & \cdots & 0 & 1 & 0\\
0 & \cdots & \cdots & \cdots & 0 & 1
\end{pmatrix},$$

whose left upper $(n-1) \times (n-1)$ submatrix is inverse to the left upper $(n-1) \times (n-1)$ submatrix of $A$ by induction. Let $C = SA$ and write $S\,(A \mid E) = (C \mid S)$ for the product. Then the left upper $(n-1) \times (n-1)$ submatrix of $C$ equals the identity matrix, the bottom rows of $(A \mid E)$ and $(C \mid S)$ coincide, and the $n$th column of $C$ consists of the elements

$$c_{in} = a_{in} + \sum_{i<k<n} b_{ik}\,a_{kn}, \quad 1 \le i \le n-1.$$

By induction, the latter sum is formed by all oriented passages from $i$ to $n$, where all $(s+1)$-step passages are taken with the sign $(-1)^s$. Eliminating the $n$th column of $C$ over the main diagonal by adding appropriate multiples of the bottom row to all the other rows, we get in the rightmost column of the resulting $n \times 2n$ matrix exactly those values $b_{in} = -c_{in}$ predicted by (8.31).
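
Formula (8.31) can be tested over a genuinely noncommutative ring. In the sketch below, which is an illustration under stated assumptions and not part of the argument above, ring elements are modelled as $2 \times 2$ integer matrices stored as nested tuples, and all helper names (`add`, `mul`, `neg`, `inverse_by_paths`, `matmul`) are hypothetical. The order of the factors in each product follows the order of the indices in (8.31).

```python
from itertools import combinations

# Elements of the ring R: 2 x 2 integer matrices as nested tuples (they do not commute).
ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 1))

def add(x, y):
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(x, y))

def mul(x, y):
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def neg(x):
    return tuple(tuple(-p for p in r) for r in x)

def inverse_by_paths(a):
    """Inverse of an upper unitriangular matrix over R via the path sums of (8.31)."""
    n = len(a)
    b = [[ONE if i == j else ZERO for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = ZERO
            for s in range(j - i):                       # s intermediate points
                for mid in combinations(range(i + 1, j), s):
                    chain = (i,) + mid + (j,)            # i < nu_1 < ... < nu_s < j
                    prod = ONE
                    for p, q in zip(chain, chain[1:]):
                        prod = mul(prod, a[p][q])        # keep the order of factors
                    # a product of s + 1 factors enters with sign (-1)^(s+1)
                    total = add(total, prod if (s + 1) % 2 == 0 else neg(prod))
            b[i][j] = total
    return b

def matmul(x, y):
    """Product of square matrices with entries in R."""
    n = len(x)
    out = [[ZERO] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = ZERO
            for k in range(n):
                acc = add(acc, mul(x[i][k], y[k][j]))
            out[i][j] = acc
    return out

P, Q, T = ((0, 1), (0, 0)), ((0, 0), (1, 0)), ((1, 2), (3, 4))   # sample entries
A = [[ONE, P, Q, T],
     [ZERO, ONE, T, P],
     [ZERO, ZERO, ONE, Q],
     [ZERO, ZERO, ZERO, ONE]]
B = inverse_by_paths(A)
E = [[ONE if i == j else ZERO for j in range(4)] for i in range(4)]
assert matmul(A, B) == E and matmul(B, A) == E
```

Enumerating all increasing chains takes exponential time in $n$, so this is meant as a check of the formula on small matrices, not as a practical inversion method.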


Problems for Independent Solution to Chap. 8 


Problem 8.1 For a ring $R$, the subring $Z(R) = \{c \in R \mid cx = xc \text{ for all } x \in R\}$ is called the center of $R$. Describe $Z(\operatorname{Mat}_n(K))$, where $K$ is an arbitrary commutative ring with unit.

Problem 8.2 For every square matrix $A$ of rank 1 over a field $\Bbbk$, show that $A^2 = \lambda A$ for some $\lambda \in \Bbbk$.

Problem 8.3 Realize the following transformations of a rectangular matrix over a commutative ring $K$ via multiplication by an appropriate matrix $B$ from the appropriate side²⁴: (a) swap the $i$th and $j$th rows, (b) multiply row $i$ by $\lambda \in K$, (c) replace the $i$th row by its sum with the $j$th row multiplied by $\lambda \in K$, (d) similar transformations of columns.


Problem 8.4 For the matrices 


=2.=)-=—] 5. =4 2-10-10 1 3 -6 2 -5 
-14 1-21 2-2 1 2 0 7 —2 3 —2 0 
—-1-2-5 4 -3]’ -7-12 2 -2]” -1 0 -2 1 -2]’ 
1-1 2-11 4 0-1-1 1 -40 -1 1 0 


explicitly choose a basis in the (a) linear span of the columns, (b) linear span of
the rows, (c) annihilator of the linear span of the rows, (d) quotient of $\mathbb{Q}^5$ by the
linear span of the rows, (e) quotient of $\mathbb{Q}^4$ by the annihilator of the linear span of the
columns, (f) dual vector space to the quotient of $\mathbb{Q}^5$ by the linear span of the rows.


Problem 8.5 Explicitly indicate some basis in the sums and intersections of the
following pairs of vector subspaces in $\mathbb{Q}^4$:


(a) linear spans of vectors $(1,1,1,1)$, $(1,-1,1,-1)$, $(1,3,1,3)$ and $(1,2,0,2)$,
$(1,2,1,2)$, $(3,1,3,1)$,

(b) linear span of vectors $(1,1,0,0)$, $(0,1,1,0)$, $(0,0,1,1)$ and solution space of
equations $x_1 + x_3 = 2x_2 + x_3 + x_4 = x_1 + 2x_2 + x_3 + 2x_4 = 0$,

(c) solution space of equations $x_1 + x_2 = x_2 + x_3 = x_3 + x_4 = 0$, and solution space
of equations $x_1 + 2x_2 + 2x_4 = x_1 + 2x_2 + x_3 + 2x_4 = 3x_1 + x_2 + 3x_3 + x_4 = 0$.


Problem 8.6 For the following pairs of vector subspaces in $\mathbb{Q}^4$:


(a) the linear span of the vectors $(-11,8,-1,2)$, $(-6,5,2,3)$, $(-3,2,-1,0)$ and
the linear span of vectors $(-8,4,12,-4)$, $(-6,-5,-9,1)$, $(-2,-3,-6,1)$,

(b) the linear span of the vectors $(30,5,-9,1)$, $(2,8,4,-3)$, $(-6,-4,0,1)$ and
solution space of the linear equations

\[ 2x_1 + 3x_2 - 4x_3 + x_4 = -x_1 - 2x_2 + 3x_3 - x_4 = -x_1 + 2x_3 - x_4 = 0, \]

(c) the solution spaces of the linear equations

$-3x_1 - 10x_2 + 20x_3 - 6x_4 = 0$, $6x_1 - 7x_2 - 3x_3 - 2x_4 = 0$,
$-15x_1 + 4x_2 + 19x_3 - 3x_4 = 0$, and $-7x_1 - x_2 + 6x_3 - x_4 = 0$,
$6x_1 - 10x_3 + 2x_4 = 0$, $5x_1 - 4x_2 - 3x_3 - x_4 = 0$,

select the complementary pairs of subspaces, and for each such pair, project the
standard basis vectors of $\mathbb{Q}^4$ to the first subspace along the second.


²⁴ Give the side and explicitly write $B$.


Problem 8.7 Verify that the annihilator of the linear form $x_1 + x_2 + \cdots + x_n$ and the
solution space of the $(n-1)$ linear equations $x_1 = x_2 = \cdots = x_n$ are complementary
in $\mathbb{Q}^n$. Project the standard basis vectors of $\mathbb{Q}^n$ to the first subspace along the
second.


Problem 8.8 Drawn on a sheet of graph paper is an m x n rectangle formed by 
segments of the grid lines joining four points where two lines intersect. The 
cells of an external circuit of the rectangle are filled by some rational numbers. 
Prove that the internal cells of the rectangle can be filled with rational numbers in 
such a way that each element equals the arithmetic mean of its eastern, northern, 
western, and southern neighbors. How many such fillings are possible? 


Problem 8.9 Six edges of a regular tetrahedron are marked by rational numbers
$b_1, b_2, \dots, b_6$. Let us call such a marking admissible if there exists a marking
of four faces of the tetrahedron by rational numbers such that the sum of the
numbers on each pair of faces equals the number on their common edge. Find
all admissible markings $(b_1, b_2, \dots, b_6)\in\mathbb{Q}^6$, and for each of them describe all
compatible markings of faces.


Problem 8.10 Eight vertices of a cube are marked by rational numbers
$b_1, b_2, \dots, b_8$. Call such a marking admissible if there exists a marking of six
faces of the cube by rational numbers such that the number on each vertex
equals the sum of numbers on three faces sharing this vertex. Find all admissible
markings $(b_1, b_2, \dots, b_8)\in\mathbb{Q}^8$, and for each of them, describe all compatible
markings of faces.

Problem 8.11 For nonzero $a\in\mathbb{C}$, find $\begin{pmatrix} a & 1 & 1\\ 0 & a & 1\\ 0 & 0 & a \end{pmatrix}^{-1}$.

Problem 8.12 For arbitrarily given $a_1, a_2, \dots, a_n\in\mathbb{N}$, consider the matrix
$A\in\mathrm{Mat}_n(\mathbb{C})$ with elements $e^{2\pi i/a_j}$ on the secondary diagonal and zeros in all other
places. Find the minimal $m\in\mathbb{N}$ such that $A^m = E$.


Problem 8.13 (Commutators) Given two square matrices $A, B\in\mathrm{Mat}_n(k)$ over a
field $k$, the difference $[A,B] \stackrel{\text{def}}{=} AB - BA$ is called the commutator of $A$, $B$. For
all $A, B, C\in\mathrm{Mat}_n(k)$, prove the Leibniz rules: (a) $[A, BC] = [A,B]C + B[A,C]$,
(b) $[A,[B,C]] = [[A,B],C] + [B,[A,C]]$.

Problem 8.14 Express $(A+B)^n$ through $A^iB^j$ if (a) $[A,B] = 0$, (b*) $[A,B] = B$,
(c*) $[A,B] = A$.

Problem 8.15 (Trace) The sum of all the elements on the main diagonal of a square
matrix $A$ is called the trace of $A$ and is denoted by $\operatorname{tr} A \stackrel{\text{def}}{=} \sum_i a_{ii}$. In the matrix
algebra $\mathrm{Mat}_n(K)$ over a commutative ring $K$, prove that (a) $\operatorname{tr}[A,B] = 0$ for all
$A$, $B$, (b) $\operatorname{tr}(C^{-1}AC) = \operatorname{tr}(A)$ for every $A$ and invertible $C$.

Problem 8.16 In the vector space $\mathrm{Mat}_n(k)$ over a field $k$, consider a system of linear
equations $\operatorname{tr}(AX) = 0$ in the unknown matrix $X$, where $A\in\mathrm{Mat}_n(k)$ runs through
all matrices with zero trace. Show that the solution space of this system is the
1-dimensional space of scalar matrices $X = \lambda E$, $\lambda\in k$.

Problem 8.17 (Nilpotent Matrices) A square matrix $A$ is called nilpotent if $A^n = 0$
for some $n\in\mathbb{N}$. Assuming that $A, B\in\mathrm{Mat}_n(k)$ are nilpotent, show that: (a) $A+B$
is not necessarily nilpotent, (b) if $[A,B] = 0$, then $A+B$ is nilpotent, (c*) if
$[A,[A,B]] = [B,[B,A]] = 0$, then $A+B$ is nilpotent.

Problem 8.18 (Unipotent Matrices) A square matrix $A$ over a field $k$ is called
unipotent if $E - A$ is a nilpotent matrix. Show that


(a) if $\operatorname{char} k = p > 0$, then $A\in\mathrm{Mat}_n(k)$ is unipotent if and only if $A^{p^m} = E$ for
some $m\in\mathbb{N}$,
(b) if $\operatorname{char} k = 0$, then $A\in\mathrm{Mat}_n(k)$ is unipotent if and only if

\[ A = e^N \stackrel{\text{def}}{=} \sum_{k\ge 0} N^k/k! = E + N + \tfrac{1}{2}N^2 + \tfrac{1}{6}N^3 + \cdots \]

for some nilpotent matrix $N\in\mathrm{Mat}_n(k)$ (note that for all such $N$, the sum is
finite).


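Problem 8.19 Solve the following equations in the unknown matrix $X\in\mathrm{Mat}_2(\mathbb{C})$:
(a) $X^2 = 0$, (b) $X^3 = 0$, (c) $X^2 = X$, (d) $X^2 = E$, (e) $X^2 = -E$.

Problem 8.20 For matrices $A$, $B$, $C$ of sizes $k\times\ell$, $\ell\times m$, $m\times n$, prove that
(a) $\operatorname{rk}(AB) \le \min(\operatorname{rk} A, \operatorname{rk} B)$, (b) $\operatorname{rk}(AB) + \operatorname{rk}(BC) \le \operatorname{rk}(ABC) + \operatorname{rk}(B)$,
(c) $\operatorname{rk}(A) + \operatorname{rk}(B) \le \operatorname{rk}(AB) + \ell$.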

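Problem 8.21 Let $W = V\oplus V$, where $V = k^n$, $k$ a field.


(a) Show that $\operatorname{End}(W) \simeq \mathrm{Mat}_{2\times 2}(\operatorname{End}(V))$.

(b) For an invertible matrix $A\in\mathrm{Mat}_n(k)$ and arbitrary matrices $B, C, D\in\mathrm{Mat}_n(k)$
such that $\operatorname{rk}\begin{pmatrix} A & B\\ C & D\end{pmatrix} = n$, show that $D = CA^{-1}B$.

(c) For invertible matrices $A, B, C, D\in\mathrm{Mat}_n(k)$, show that the matrices $A - BD^{-1}C$,
$C - DB^{-1}A$, $B - AC^{-1}D$, $D - CA^{-1}B$ are invertible in $\mathrm{Mat}_n(k)$. Check that in
$\mathrm{Mat}_{2n}(k)$,

\[ \begin{pmatrix} A & B\\ C & D\end{pmatrix}^{-1} = \begin{pmatrix} (A - BD^{-1}C)^{-1} & (C - DB^{-1}A)^{-1}\\ (B - AC^{-1}D)^{-1} & (D - CA^{-1}B)^{-1}\end{pmatrix}. \]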


Problem 8.22 (Möbius Inversion Formula) A poset²⁵ $P$ is called locally finite if
it has a lower bound²⁶ $p_*\in P$ and for all $x, y\in P$, the segment

\[ [x,y] \stackrel{\text{def}}{=} \{z\in P \mid x\le z\le y\} \]

is finite. For a locally finite poset $P$, write $\mathcal{A} = \mathcal{A}(P)$ for the set of all functions
$\varrho : P\times P\to R$ such that $\varrho(x,y)\ne 0$ only if $x\le y$. Define addition and
multiplication in $\mathcal{A}$ by

\[ \varrho_1 + \varrho_2 : (x,y)\mapsto \varrho_1(x,y) + \varrho_2(x,y), \qquad \varrho_1 * \varrho_2 : (x,y)\mapsto \sum_{x\le z\le y} \varrho_1(x,z)\,\varrho_2(z,y). \]

(a) Show that $\mathcal{A}$ is an associative $R$-algebra with unit and $\varrho\in\mathcal{A}$ is invertible if and
only if $\varrho(x,x)\ne 0$ for all $x\in P$.
(b) Define the Möbius function $\mu\in\mathcal{A}$ to be the element inverse to the zeta function

\[ \zeta(x,y) \stackrel{\text{def}}{=} \begin{cases} 1 & \text{if } x\le y,\\ 0 & \text{otherwise.}\end{cases} \]

Show that $\mu(x,y) = -\sum_{x\le z<y} \mu(x,z) = -\sum_{x<z\le y} \mu(z,y)$ for all $x<y$.

(c) For a function $g : P\to R$, the function $\sigma_g(x) \stackrel{\text{def}}{=} \sum_{y\le x} g(y)$ is called the Möbius
transform of $g$. Show that $g$ is uniquely recovered from $\sigma_g$ by the following
Möbius inversion formula:

\[ g(x) = \sum_{y\le x} \sigma_g(y)\,\mu(y,x). \]

(d) Give a precise intrinsic description of the Möbius inversion formula for $P = \mathbb{N}$
partially ordered by the divisibility relation $n \mid m$ (compare with Problem 2.20
on p. 39).

(e) For a set $X$, give a precise intrinsic description of the Möbius inversion formula
in the set of all finite subsets of $X$ partially ordered by inclusion and compare
the answer with the combinatorial inclusion-exclusion principle.

(f) Deduce the inversion formula for upper unitriangular matrices in Example 8.17
on p. 197 from the Möbius inversion formula.


²⁵ See Sect. 1.4 on p. 13.
²⁶ That is, if there is $p_*\in P$ such that $p_*\le x$ for all $x\in P$.


Problem 8.23 For a linear endomorphism $F : V\to V$ of a finite-dimensional
vector space $V$ over a field $k$, show that the minimal polynomial²⁷ $\mu_F(x)$ of $F$
is divisible in $k[x]$ by the products $\prod(x - \lambda)$ taken over any sets of mutually
distinct eigenvalues²⁸ $\lambda\in k$ of $F$.


Problem 8.24 Find the kernel, the images, and the minimal polynomial for both
difference operators $\Delta : f(x)\mapsto f(x+1) - f(x)$ and $\nabla : f(x)\mapsto f(x) - f(x-1)$
on the space of polynomials of degree at most $n$ over $\mathbb{Q}$.

Problem 8.25 For a polynomial $f = a_0x^m + a_1x^{m-1} + \cdots + a_{m-1}x + a_m\in k[x]$ and
a matrix $A\in\mathrm{Mat}_n(k)$, we put $f(A) \stackrel{\text{def}}{=} a_0A^m + a_1A^{m-1} + \cdots + a_{m-1}A + a_mE$ and
write $k[A]\subset\mathrm{Mat}_n(k)$ for the image of the evaluation map $\operatorname{ev}_A : k[x]\to\mathrm{Mat}_n(k)$,
$f\mapsto f(A)$. Recall that the minimal polynomial $\mu_A\in k[x]$ of $A$ is the monic
generator of the principal ideal $\ker\operatorname{ev}_A\subset k[x]$.


(a) Find a matrix $A\in\mathrm{Mat}_2(\mathbb{Z})$ such that $\mu_A(x) = x^2 - 2$ and show that $\mathbb{Q}[A]$ is a
field.
(b) Find a matrix $A\in\mathrm{Mat}_2(\mathbb{R})$ such that $\mathbb{R}[A]\simeq\mathbb{C}$ and explicitly describe all
$a, b, c, d\in\mathbb{R}$ such that $\begin{pmatrix} a & b\\ c & d\end{pmatrix}\in\mathbb{R}[A]$.

(c) Find the minimal polynomial of the matrix

\[ \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_n\\
1 & 0 & \cdots & 0 & -a_{n-1}\\
0 & 1 & \ddots & \vdots & \vdots\\
\vdots & \ddots & \ddots & 0 & -a_2\\
0 & \cdots & 0 & 1 & -a_1
\end{pmatrix}. \]

(d) For a diagonal matrix $A\in\mathrm{Mat}_n(k)$ with distinct diagonal elements, show that
$k[A] = \{X\in\mathrm{Mat}_n(k) \mid AX = XA\}$.
(e*) Show that $\dim k[A]\le n$ for all $A\in\mathrm{Mat}_n(k)$.


²⁷ See Sect. 8.1.3 on p. 175.
²⁸ See Problem 7.13 on p. 170.


Chapter 9 
Determinants 


9.1 Volume Forms 


9.1.1 Volume of an n-Dimensional Parallelepiped 


Let $V$ be a vector space of dimension $n$ over a field $k$. We are going to define
the volume of a parallelepiped whose edges from some base vertex are $n$ vectors
$v_1, v_2, \dots, v_n$ as in Fig. 9.1. Such a volume is a function

\[ \omega : V\times V\times\cdots\times V\to k, \qquad v_1, v_2, \dots, v_n \mapsto \omega(v_1, v_2, \dots, v_n). \tag{9.1} \]


Fig. 9.1 Parallelepiped

Fig. 9.2 Parallel shift

Geometric intuition imposes two natural restrictions. 

First, the volume should not change its value when the parallelepiped undergoes
a parallel shift of opposite $(n-1)$-dimensional faces with respect to each other along
some edge parallel to both faces as in Fig. 9.2. Drawn there is the projection of the
parallelepiped on the 2-dimensional plane spanned by edge $v_2$ joining the shifted
faces and edge $v_1$ providing the direction of the shift. The projection goes along the
$(n-2)$-dimensional faces complementary to the target plane of the projection. All
these faces are mapped to the vertices of the parallelogram in Fig. 9.2. The parallel
shift in question replaces $v_2$ by $v_2 + \lambda v_1$. Since the prism cut from the left is attached
back from the right, the total volume of the parallelepiped remains unchanged.

Second, at least for $k = \mathbb{Q}, \mathbb{R}$ and positive $\lambda$, multiplication of any edge by
$\lambda$ leads to multiplication of the volume by $\lambda$. We would like to extrapolate this
homogeneity property uniformly to arbitrary fields $k$ and all $\lambda\in k$. Writing these
geometric constraints algebraically, we come to the following definition.


Definition 9.1 A function of $n$ vectors $\omega : V\times V\times\cdots\times V\to k$ is called an
$n$-dimensional volume form on an $n$-dimensional vector space $V$ if it possesses
the following two properties (all dotted arguments remain unchanged in both
formulas):


(1) $\omega(\dots, v_i + \lambda v_j, \dots, v_j, \dots) = \omega(\dots, v_i, \dots, v_j, \dots)$ for all $i\ne j$ and
all $\lambda\in k$.
(2) $\omega(\dots, \lambda v_i, \dots) = \lambda\,\omega(\dots, v_i, \dots)$ for all $i$ and all $\lambda\in k$.


Lemma 9.1 Every volume form vanishes on linearly related $v_1, v_2, \dots, v_n$; every
volume form is linear in each argument,

\[ \omega(\dots, \lambda v + \mu w, \dots) = \lambda\,\omega(\dots, v, \dots) + \mu\,\omega(\dots, w, \dots); \tag{9.2} \]

and every volume form alternates sign under transpositions of arguments:

\[ \omega(\dots, v, \dots, w, \dots) = -\omega(\dots, w, \dots, v, \dots). \tag{9.3} \]


Proof If vectors $v_1, v_2, \dots, v_n$ are linearly related, then one of them is a linear
combination of the others, say $v_1 = \lambda_2v_2 + \cdots + \lambda_nv_n$. Then by the first property
of a volume form,

\[ \begin{aligned}
\omega(v_1, v_2, \dots, v_n) &= \omega(v_1 - \lambda_2v_2 - \cdots - \lambda_nv_n, v_2, \dots, v_n)\\
&= \omega(0, v_2, \dots, v_n) = \omega(0\cdot 0, v_2, \dots, v_n)\\
&= 0\cdot\omega(0, v_2, \dots, v_n) = 0.
\end{aligned} \]

The proof of (9.2) splits into two cases. If the arguments in each term on the right-hand
side are linearly related, then the arguments on the left-hand side are linearly
related too, and both sides of (9.2) vanish. Therefore, we can assume without loss
of generality that the arguments of the first term on the right-hand side form a basis
of $V$. Then the vector $w$ is expressed through this basis as $w = \varrho v + u$, where $u$ lies in
the linear span of the remaining $(n-1)$ arguments. The first property of a volume
form allows us to rewrite the left-hand side of (9.2) as $\omega(\dots, \lambda v + \mu w, \dots) =
\omega(\dots, (\lambda + \mu\varrho)v + \mu u, \dots) = \omega(\dots, (\lambda + \mu\varrho)v, \dots)$ and the second
term on the right-hand side as $\mu\,\omega(\dots, w, \dots) = \mu\,\omega(\dots, \varrho v + u, \dots) =
\mu\,\omega(\dots, \varrho v, \dots)$. Hence the whole right-hand side equals $\lambda\,\omega(\dots, v, \dots) +
\mu\,\omega(\dots, \varrho v, \dots) = (\lambda + \mu\varrho)\,\omega(\dots, v, \dots)$ and coincides with the left-hand
side. The equality (9.3) follows from the linearity and the vanishing property already
proven,

\[ 0 = \omega(\dots, v + w, \dots, v + w, \dots) = \omega(\dots, v, \dots, w, \dots) + \omega(\dots, w, \dots, v, \dots), \]

because $\omega(\dots, v, \dots, v, \dots) = \omega(\dots, w, \dots, w, \dots) = 0$. □

9.1.2 Skew-Symmetric Multilinear Forms 


Definition 9.2 Let $V$ be an arbitrary module over a commutative ring $K$. A map
$\omega : V\times V\times\cdots\times V\to K$ is called a skew-symmetric multilinear¹ form if it is
linear in each argument and vanishes if some arguments coincide. For example,
the $2\times 2$ determinant $\det(v_1, v_2)$ considered in Example 6.4 on p. 125 is a bilinear
skew-symmetric form on $k^2$.


Example 9.1 (Volume Form) A volume form on an $n$-dimensional vector space $V$
is skew-symmetric $n$-linear by Lemma 9.1. Conversely, every skew-symmetric multilinear
form of $n$ arguments satisfies both properties from Definition 9.1 on p. 206
and therefore produces a volume form on $V$. Indeed, the second property is just a
part of linearity, and the first property follows from linearity and skew-symmetry:

\[ \begin{aligned}
\omega(\dots, v_i + \lambda v_j, \dots, v_j, \dots)
&= \omega(\dots, v_i, \dots, v_j, \dots) + \lambda\,\omega(\dots, v_j, \dots, v_j, \dots)\\
&= \omega(\dots, v_i, \dots, v_j, \dots).
\end{aligned} \]


¹ Or $m$-linear if the number of arguments $m$ needs to be indicated explicitly.

Remark 9.1 (Sign Alternation vs. Skew-Symmetry) The above proof of equality
(9.3) shows in fact that every skew-symmetric multilinear form $\omega$ alternates sign
under transpositions of arguments, i.e., satisfies

\[ \omega(\dots, v, \dots, w, \dots) = -\omega(\dots, w, \dots, v, \dots). \]

Conversely, the substitution $w = v$ transforms the sign alternation condition to
$2\,\omega(\dots, v, \dots, v, \dots) = 0$, which implies skew-symmetry if $2 = 1 + 1$ is not a
zero divisor in $K$. Note that for $\operatorname{char} K = 2$, the sign alternation means the same as
the invariance under transpositions of arguments, whereas the skew-symmetry adds
an extra nontrivial constraint to this invariance.


9.2 Digression on Parities of Permutations 


Let us treat permutations $g = (g_1, g_2, \dots, g_n)$ of an ordered collection
$(1, 2, \dots, n)$ as bijective maps

\[ g : \{1, 2, \dots, n\}\to\{1, 2, \dots, n\}, \qquad i\mapsto g_i. \]

These maps form the transformation group $S_n = \operatorname{Aut}(\{1, 2, \dots, n\})$ discussed in
Example 1.7 on p. 13. The composition $fg$ of maps $f, g\in S_n$ takes $i$ to $f(g(i))$. For
example, the permutation $f = (2,4,3,5,1)\in S_5$ is performed by the map $1\mapsto 2$,
$2\mapsto 4$, $3\mapsto 3$, $4\mapsto 5$, $5\mapsto 1$. Two of its compositions with the permutation
$g = (3,2,1,5,4)$ are $fg = (3,4,2,1,5)$ and $gf = (2,5,1,4,3)$.
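The two compositions above can be recomputed mechanically. This is a small sketch in which a permutation is stored as the tuple of its values $(g_1, \dots, g_n)$; the helper name `compose` is illustrative.

```python
def compose(f, g):
    """Return fg, the map i -> f(g(i)), for permutations of 1..n given as value tuples."""
    return tuple(f[g_i - 1] for g_i in g)  # shift by 1: tuples are 0-indexed

f = (2, 4, 3, 5, 1)
g = (3, 2, 1, 5, 4)
print(compose(f, g))  # (3, 4, 2, 1, 5)
print(compose(g, f))  # (2, 5, 1, 4, 3)
```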

We write $s_{ij}$ for the permutation that swaps $i$ and $j$ and leaves all the other numbers
fixed. The permutations $s_{ij}$ are called transpositions.


Exercise 9.1 Prove that every permutation is a composition of transpositions. 


We say that a permutation $g$ is even (respectively odd) if $g$ is the composition
of an even (respectively odd) number of transpositions. Note that the factorization
of $g$ as a composition of transpositions is not unique. For example, even one
transposition $s_{13} = (3,2,1)\in S_3$ can be written in three essentially different ways:
$s_{13} = s_{12}s_{23}s_{12} = s_{23}s_{12}s_{23}$. Nevertheless, the parity of $g$ remains the same for all
factorizations. We verify this by giving another definition of parity independent of
factorization.
For $g\in S_n$, a pair of increasing integers $(i,j)$ in the range $1\le i<j\le n$ is called
an inversion of $g$ if $g(i) > g(j)$. Therefore, each $g\in S_n$ decomposes the set of all
$n(n-1)/2$ pairs $(i,j)$ into two disjoint parts: inversions and noninversions. The total
number of inversions is called the inversion number of $g$. We denote it by $I(g)$.


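Lemma 9.2 For each transposition $s_{ij}$ and arbitrary $g\in S_n$, the inversion numbers
$I(g)$ and $I(gs_{ij})$ have opposite parities.


Proof Let $i<j$. Then the permutations

\[ \begin{aligned}
g &= (g_1, \dots, g_{i-1}, g_i, g_{i+1}, \dots, g_{j-1}, g_j, g_{j+1}, \dots, g_n),\\
gs_{ij} &= (g_1, \dots, g_{i-1}, g_j, g_{i+1}, \dots, g_{j-1}, g_i, g_{j+1}, \dots, g_n),
\end{aligned} \tag{9.4} \]

differ from each other by swapping the elements $g_i = g(i)$ and $g_j = g(j)$ at the $i$th
and $j$th positions. The pairs of opposite inversion status² with respect to these two
permutations are exhausted by $(i,j)$ and the pairs $(i,m)$, $(m,j)$ with $i<m<j$. The pair
$(i,j)$ always changes its status, and for each such $m$ the pairs $(i,m)$ and $(m,j)$ either
both change their status or both keep it (both change exactly when $g_m$ lies between $g_i$
and $g_j$). Thus the difference $I(g) - I(gs_{ij})$ is odd. □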


Corollary 9.1 (Sign of a Permutation) There exists a unique map

\[ \operatorname{sgn} : S_n\to\{+1, -1\} \]

such that $\operatorname{sgn}(\mathrm{Id}) = 1$, $\operatorname{sgn}(s_{ij}) = -1$ for every transposition $s_{ij}$, and

\[ \operatorname{sgn}(fg) = \operatorname{sgn}(f)\cdot\operatorname{sgn}(g) \]

for all $f, g\in S_n$. It is defined by $\operatorname{sgn}(g) = (-1)^{I(g)}$ and equals $+1$ if $g$ is even, and
$-1$ if $g$ is odd.


Proof If $g\in S_n$ is a composition of $k$ transpositions, then $(-1)^{I(g)} = (-1)^k$ by
Lemma 9.2. □


Example 9.2 (Thread Rule) The parity of the inversion number can be determined
geometrically as follows. Align the numbers $1, 2, \dots, n$ and $g_1, g_2, \dots, g_n$ in two
rows and join equal numbers by smooth curves (threads) lying inside the rectangle
with vertices $1$, $n$, $g_n$, $g_1$ and having only simple double crossings³ as in Fig. 9.3.
Then the total number of intersections of threads has the same parity as $I(g)$. This
method of computation of $\operatorname{sgn} g$ is known as the thread rule.


² Inversions of one permutation and noninversions of the other.

³ That is, at each intersection point, only two threads cross, and they have different tangents at this
point.

Fig. 9.3 $\operatorname{sgn}(2, 9, 6, 1, 8, 3, 5, 7, 4) = +1$ (18 intersections)
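The inversion number and the sign are easy to compute directly from the definition. The sketch below (helper names are illustrative) recomputes the inversion number of the permutation in the caption of Fig. 9.3 and checks the statement of Lemma 9.2 exhaustively on $S_4$.

```python
from itertools import permutations

def inversion_number(g):
    """Count pairs of positions i < j with g(i) > g(j)."""
    n = len(g)
    return sum(1 for i in range(n) for j in range(i + 1, n) if g[i] > g[j])

def sign(g):
    """sgn(g) = (-1)^I(g)."""
    return -1 if inversion_number(g) % 2 else 1

def times_transposition(g, i, j):
    """Return g s_ij: the entries at the (1-based) positions i and j are swapped."""
    h = list(g)
    h[i - 1], h[j - 1] = h[j - 1], h[i - 1]
    return tuple(h)

g = (2, 9, 6, 1, 8, 3, 5, 7, 4)
print(inversion_number(g), sign(g))  # 18 1

# Lemma 9.2 on S_4: multiplying by any transposition flips the sign.
lemma_holds = all(
    sign(times_transposition(h, i, j)) == -sign(h)
    for h in permutations(range(1, 5))
    for i in range(1, 5)
    for j in range(i + 1, 5)
)
print(lemma_holds)  # True
```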


Exercise 9.2 Convince yourself that the thread rule is correct and use it to show
that the shuffle permutation $(i_1, i_2, \dots, i_k, j_1, j_2, \dots, j_m)$, where $i_1 < i_2 < \cdots < i_k$,
$j_1 < j_2 < \cdots < j_m$, has sign $(-1)^{|I| + \frac{1}{2}k(k+1)}$, where $|I| \stackrel{\text{def}}{=} \sum_\nu i_\nu$ is the weight of the
multi-index $I$.


9.3 Determinants


Theorem 9.1 For a commutative ring $K$ with unit there exists a unique, up to a
constant factor, nonzero skew-symmetric $n$-linear form $\omega$ on the coordinate module
$K^n$. Its value on an arbitrary collection of vectors $v = (v_1, v_2, \dots, v_n)$, which is
expressed through the standard basis $e = (e_1, e_2, \dots, e_n)$ of $K^n$ as⁴ $v = e\cdot C_{ev}$,
where $C_{ev} = (c_{ij})\in\mathrm{Mat}_n(K)$, is

\[ \begin{aligned}
\omega(v_1, v_2, \dots, v_n) &= \omega(e_1, e_2, \dots, e_n)\cdot\det(c_{ij}), \quad\text{where}\\
\det(c_{ij}) &\stackrel{\text{def}}{=} \sum_{g\in S_n} \operatorname{sgn}(g_1, g_2, \dots, g_n)\cdot c_{g_11}c_{g_22}\cdots c_{g_nn}
\end{aligned} \tag{9.5} \]

(the summation is over all permutations $g = (g_1, g_2, \dots, g_n)\in S_n$).


Definition 9.3 The function $\det(c_{ij})$ in the bottom row of (9.5) is called the
determinant of the square matrix $(c_{ij})$. A matrix $C$ is called degenerate if $\det C = 0$.
Otherwise, $C$ is called nondegenerate.


⁴ Recall that the $j$th column of the transition matrix $C_{ev}$ is formed by the coordinates of the vector $v_j$
in the standard basis $e$.
Proof (of Theorem 9.1) Let us substitute $v_j = \sum_{i=1}^n e_i\cdot c_{ij}$ in $\omega(v_1, v_2, \dots, v_n)$. By
the multilinearity of $\omega$, we get

\[ \begin{aligned}
\omega(v_1, v_2, \dots, v_n) &= \omega\Bigl(\sum_{i_1} e_{i_1}c_{i_11}, \sum_{i_2} e_{i_2}c_{i_22}, \dots, \sum_{i_n} e_{i_n}c_{i_nn}\Bigr)\\
&= \sum_{i_1i_2\dots i_n} \omega(e_{i_1}, e_{i_2}, \dots, e_{i_n})\cdot c_{i_11}\cdot c_{i_22}\cdots c_{i_nn}.
\end{aligned} \]

By the skew-symmetry of $\omega$, the nonzero summands in the latter sum are only those
with distinct indices $(i_1, i_2, \dots, i_n)$. Such indices are exactly the permutations of
$(1, 2, \dots, n)$. Thus,

\[ \omega(e_{i_1}, e_{i_2}, \dots, e_{i_n}) = \begin{cases}
+\omega(e_1, e_2, \dots, e_n) & \text{for even permutations } (i_1, i_2, \dots, i_n),\\
-\omega(e_1, e_2, \dots, e_n) & \text{for odd permutations } (i_1, i_2, \dots, i_n).
\end{cases} \]

This forces every skew-symmetric $n$-linear form $\omega$ on $K^n$ to be computed by
formula (9.5). Therefore, such a nonzero form, if it exists, is unique up to
the constant factor $\omega(e_1, e_2, \dots, e_n)$. To prove the existence, let us define $\omega$ by
$\omega(v_1, v_2, \dots, v_n) \stackrel{\text{def}}{=} \det C_{ev}$. We have to check that this function is multilinear,
skew-symmetric, and does not vanish identically. This will be done in Proposition 9.1
below after we have developed some properties of determinants. □


9.3.1 Basic Properties of Determinants 


The computational procedure encoded by the formula

\[ \det C \stackrel{\text{def}}{=} \sum_{g\in S_n} \operatorname{sgn}(g)\cdot c_{g_11}c_{g_22}\cdots c_{g_nn} \tag{9.6} \]

prescribes the listing of all $n$-tuples of matrix elements such that each $n$-tuple
contains exactly one element of each row and exactly one element of each column.
Every such $n$-tuple can be viewed as the graph of the one-to-one correspondence
between rows and columns. Associated with such a correspondence are two
permutations of the set $\{1, 2, \dots, n\}$: the first takes $i$ to the number of the
column corresponding to the $i$th row; the second takes $i$ to the number of the row
corresponding to the $i$th column. These two permutations are inverse to each other
and therefore have the same sign.


Exercise 9.3 Verify that inverse permutations have the same parity. 


We attach this sign to the $n$-tuple of matrix elements that is the graph of the
correspondence. Then we multiply the elements in each $n$-tuple, and sum all these $n!$
products with the attached signs. For $2\times 2$ and $3\times 3$ determinants, this computation
gives

\[ \det\begin{pmatrix} c_{11} & c_{12}\\ c_{21} & c_{22}\end{pmatrix} = c_{11}c_{22} - c_{12}c_{21}, \tag{9.7} \]

\[ \begin{aligned}
\det\begin{pmatrix} c_{11} & c_{12} & c_{13}\\ c_{21} & c_{22} & c_{23}\\ c_{31} & c_{32} & c_{33}\end{pmatrix}
&= c_{11}c_{22}c_{33} + c_{13}c_{21}c_{32} + c_{12}c_{23}c_{31}\\
&\quad - c_{11}c_{23}c_{32} - c_{13}c_{22}c_{31} - c_{12}c_{21}c_{33}.
\end{aligned} \tag{9.8} \]

(In the second expansion, the first three summands correspond to the identity
and two cyclic permutations, and the latter three summands correspond to the
transpositions.)
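Formula (9.6) translates almost literally into code. The following sketch sums over all permutations exactly as the formula prescribes; it is meant as an illustration of the definition (the cost grows like $n!$), and the names `sign`, `det`, and `transpose` are illustrative.

```python
from itertools import permutations

def sign(g):
    """Sign of a permutation given as a tuple of distinct integers."""
    n = len(g)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if g[i] > g[j])
    return -1 if inversions % 2 else 1

def det(c):
    """Determinant by formula (9.6): sum of sgn(g) * c[g_1][1] * ... * c[g_n][n]."""
    n = len(c)
    total = 0
    for g in permutations(range(n)):  # g[j] is the row paired with column j (0-based)
        term = sign(g)
        for j in range(n):
            term *= c[g[j]][j]
        total += term
    return total

def transpose(c):
    return [list(row) for row in zip(*c)]

C = [[2, 1, 0],
     [1, 3, 1],
     [5, 1, 4]]
print(det([[1, 2], [3, 4]]))  # -2, in agreement with (9.7)
print(det(C))                 # 23, in agreement with (9.8)
print(det(transpose(C)))      # 23
```

The last line shows, on one example, that a matrix and its transpose have the same determinant.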


Exercise 9.4 Verify that det E = 1. 


Since rows and columns play completely symmetric roles in the computation just
described, we conclude that a matrix and its transpose⁵ have equal determinants:

\[ \det C = \sum_{g\in S_n} \operatorname{sgn}(g)\cdot c_{g_11}c_{g_22}\cdots c_{g_nn} = \sum_{g\in S_n} \operatorname{sgn}(g)\cdot c_{1g_1}c_{2g_2}\cdots c_{ng_n} = \det C^t. \tag{9.9} \]
Proposition 9.1 Write $v_1, v_2, \dots, v_n$ for the columns of a matrix $C\in\mathrm{Mat}_n(K)$ and
consider them as vectors in $K^n$. Then the function $\det(v_1, v_2, \dots, v_n) \stackrel{\text{def}}{=} \det C$ is
linear in each argument and skew-symmetric.


Proof Each of the $n!$ products in the expansion of $\det C$ contains exactly one factor
taken from the $j$th column of $C$. Hence, $\det C$ is linear in $v_j$. To prove
skew-symmetry, let $v_i = v_j$ and collect pairs of products corresponding to the pairs of
permutations $g$ and $gs_{ij}$, where $s_{ij}$ is the transposition of $i$ with $j$. Since $c_{g_ii} = c_{g_ij}$
and $c_{g_jj} = c_{g_ji}$, whereas $\operatorname{sgn}(g) = -\operatorname{sgn}(gs_{ij})$, the products in every such pair cancel
each other:

\[ \operatorname{sgn}(g)\cdot c_{g_11}\cdots c_{g_ii}\cdots c_{g_jj}\cdots c_{g_nn} + \operatorname{sgn}(gs_{ij})\cdot c_{g_11}\cdots c_{g_ji}\cdots c_{g_ij}\cdots c_{g_nn} = 0. \]

Thus $\det(C) = 0$ in this case. □

Since by Exercise 9.4, $\det(v_1, v_2, \dots, v_n)$ is a nonzero function of $v_1, v_2, \dots, v_n$,
Proposition 9.1 finishes the proof of Theorem 9.1. The equality $\det C = \det C^t$
allows us to reformulate Proposition 9.1 in terms of rows.


⁵ Recall that a matrix $C = (c_{ij})$ and its transpose $C^t = (c^t_{ij})$ have $c^t_{ij} = c_{ji}$.
Corollary 9.2 Considered as a function of the matrix rows, the determinant is
skew-symmetric and linear in each row.


Corollary 9.3 Every finite-dimensional vector space $V$ possesses a unique, up to
proportionality, nonzero volume form $\omega$. If some basis $(e_1, e_2, \dots, e_n)$ in $V$ is fixed
and the form $\omega$ is normalized by the condition

\[ \omega(e_1, e_2, \dots, e_n) = 1, \]

then the volume of an arbitrary collection of vectors

\[ (v_1, v_2, \dots, v_n) = (e_1, e_2, \dots, e_n)\cdot C_{ev} \]

is equal to the determinant of the transition matrix: $\omega(v_1, v_2, \dots, v_n) = \det C_{ev}$.


Exercise 9.5 Show that $\det C = 0$ for every $C\in\mathrm{Mat}_n(k)$ such that $\operatorname{rk} C < n$, where
$k$ is a field.

Proposition 9.2 $\det(AB) = \det(A)\cdot\det(B)$ for every $A, B\in\mathrm{Mat}_n(K)$, where $K$ is
any commutative ring.


Proof Consider the difference $\det(AB) - \det(A)\cdot\det(B)$ as a polynomial with integer
coefficients in $2n^2$ variables $a_{ij}$ and $b_{ij}$. It is sufficient to check that it is the zero
polynomial.


Exercise 9.6 Let $k$ be an infinite field. Show that the polynomial $f\in
k[x_1, x_2, \dots, x_m]$ is zero if and only if it assumes the value zero, $f(p_1, p_2, \dots, p_m) =
0\in k$, for every point $(p_1, p_2, \dots, p_m)\in k^m$.


Therefore, to verify the identity $\det(AB) = \det(A)\cdot\det(B)$ in the ring of polynomials
in the matrix elements it is enough to prove the proposition for all matrices with
elements in $\mathbb{Q}$. Write $v_1, v_2, \dots, v_n\in\mathbb{Q}^n$ for the columns of $A$. If they are linearly
related, then the dimension of their linear span is less than $n$. Since the columns
of $AB$ are linear combinations of columns of $A$, the ranks of both matrices $A$ and
$AB$ are less than $n$, and $\det A = \det AB = 0$ by Exercise 9.5. Thus $\det AB =
\det A\,\det B$ in this case. Now assume that the vectors $v = (v_1, v_2, \dots, v_n)$ form a
basis of $\mathbb{Q}^n$ and write $e = (e_1, e_2, \dots, e_n)$ for the standard basis. Consider two
volume forms $\omega_e$ and $\omega_v$ on $\mathbb{Q}^n$ uniquely defined by $\omega_e(e_1, e_2, \dots, e_n) = 1$ and
$\omega_v(v_1, v_2, \dots, v_n) = 1$ respectively. By Corollary 9.3, they are proportional with
coefficient $\omega_e(v_1, v_2, \dots, v_n) = \det A$:

\[ \omega_e = \det(A)\cdot\omega_v. \tag{9.10} \]

Let $w_1, w_2, \dots, w_n\in\mathbb{Q}^n$ be those vectors whose coordinate columns in the basis
$v$ are the columns of the matrix $B$. Then $(w_1, w_2, \dots, w_n) = (v_1, v_2, \dots, v_n)\cdot B =
(e_1, e_2, \dots, e_n)\cdot AB$. Evaluating both sides of (9.10) on the vectors $w_1, w_2, \dots, w_n$,
we get $\omega_e(w_1, w_2, \dots, w_n) = \det(AB)$ and $\omega_v(w_1, w_2, \dots, w_n) = \det(B)$ by
Corollary 9.3. Therefore, $\det AB = \det A\,\det B$ for linearly independent $v_1, v_2, \dots, v_n$ as
well. □


Corollary 9.4 $\det(AB) = \det(BA)$ for every pair of square matrices $A, B\in
\mathrm{Mat}_n(K)$ over a commutative ring $K$.
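Since Proposition 9.2 is stated over an arbitrary commutative ring, it can be sanity-checked over a ring with zero divisors. The sketch below is an illustration only: the choice of $K = \mathbb{Z}/(4)$, the restriction to $2\times 2$ matrices, and the helper names are assumptions made for the example. It tests both $\det(AB) = \det A\cdot\det B$ and $\det(AB) = \det(BA)$ for all pairs of such matrices.

```python
from itertools import permutations, product

def det(c, modulus):
    """Determinant by the permutation expansion, reduced modulo `modulus`."""
    n = len(c)
    total = 0
    for g in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if g[i] > g[j])
        term = (-1) ** inversions
        for j in range(n):
            term *= c[g[j]][j]
        total += term
    return total % modulus

def matmul(a, b, modulus):
    """Matrix product with entries reduced modulo `modulus`."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % modulus
             for j in range(n)] for i in range(n)]

MOD = 4  # Z/(4) is a commutative ring with zero divisors (2 * 2 = 0)
matrices = [[[p, q], [r, s]] for p, q, r, s in product(range(MOD), repeat=4)]

multiplicative = all(
    det(matmul(a, b, MOD), MOD) == det(a, MOD) * det(b, MOD) % MOD
    and det(matmul(a, b, MOD), MOD) == det(matmul(b, a, MOD), MOD)
    for a in matrices for b in matrices
)
print(multiplicative)  # True
```

A finite check of this kind does not replace the proof; it only illustrates that the identity does not depend on $K$ being a field.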


9.3.2 Determinant of a Linear Endomorphism 


Let $V$ be a vector space of dimension $n$, and $\omega$ an arbitrary nonzero volume form
on $V$. Every linear endomorphism $F : V\to V$ makes another volume form
$\omega_F(v_1, v_2, \dots, v_n) \stackrel{\text{def}}{=} \omega(Fv_1, Fv_2, \dots, Fv_n)$ from the form $\omega$.


Exercise 9.7 Verify that $\omega_F$ is multilinear and skew-symmetric.


By Corollary 9.3, the form $\omega_F$ is proportional to $\omega$. Hence, the ratio of their values
on a basis $e_1, e_2, \dots, e_n$ of $V$ does not depend on the choice of basis. Since this ratio
is

\[ \frac{\omega(Fe_1, Fe_2, \dots, Fe_n)}{\omega(e_1, e_2, \dots, e_n)} = \frac{\omega\bigl((e_1, e_2, \dots, e_n)\cdot F_e\bigr)}{\omega(e_1, e_2, \dots, e_n)} = \frac{\omega(e_1, e_2, \dots, e_n)\cdot\det F_e}{\omega(e_1, e_2, \dots, e_n)} = \det F_e, \]

where $F_e$ is the matrix of $F$ in the basis $e$, the determinant $\det F_e$ is the same for all
bases $e$ in $V$. It is called the determinant of the operator $F$ and is denoted by $\det F$.
Under the action of $F$ on $V$, the volumes of all parallelepipeds are multiplied by
$\det F$ regardless of what volume form is used:

\[ \omega(Fv_1, Fv_2, \dots, Fv_n) = \det F\cdot\omega(v_1, v_2, \dots, v_n) \]

for an arbitrary volume form $\omega$ on $V$ and collection of $n = \dim V$ vectors $v_i\in V$.
By Proposition 9.2, $\det FG = \det F\,\det G$ for all $F, G\in\operatorname{End} V$. Therefore, the
endomorphisms of determinant 1, that is, the volume-preserving endomorphisms,
form a subgroup in the general linear group $\mathrm{GL}(V)$. This group is called the special
linear group and is denoted by $\mathrm{SL}(V)\subset\mathrm{GL}(V)$.


Exercise 9.8 Show that every linear endomorphism of determinant 1 is invertible
and the inverse map also has determinant 1.


The special linear group of the coordinate space $k^n$ consists of $n\times n$ matrices of
determinant 1 and is denoted by $\mathrm{SL}_n(k)\subset\mathrm{GL}_n(k)$.
9.4 Grassmannian Polynomials


9.4.1 Polynomials in Skew-Commuting Variables


A useful tool for handling determinants over a commutative ring $K$ is the ring
of Grassmannian polynomials in the skew-commuting variables $\xi_1, \xi_2, \dots, \xi_n$ with
coefficients in $K$. It is denoted by $K\langle\xi_1, \xi_2, \dots, \xi_n\rangle$ and defined in the same way
as the usual polynomial ring $K[x_1, x_2, \dots, x_n]$. The only difference is that the
Grassmannian variables anticommute instead of commute, that is, they satisfy the
relations

\[ \forall\, i, j\quad \xi_i\wedge\xi_j = -\xi_j\wedge\xi_i \qquad\text{and}\qquad \forall\, i\quad \xi_i\wedge\xi_i = 0, \tag{9.11} \]

where the symbol $\wedge$ stands for the skew-commutative multiplication of Grassmannian
variables in order to prevent confusion with the commutative multiplication
of variables in a usual polynomial ring. If $1 + 1$ does not divide zero in $K$, then
the latter relation $\xi_i\wedge\xi_i = 0$ in (9.11) follows from the former written for
$i = j$. However, if $1 + 1 = 0$, then the first relation in (9.11) says that the
variables commute, whereas the second postulates the actual difference between
the Grassmannian polynomials and the usual ones. It follows from (9.11) that every
nonzero Grassmannian monomial is at most linear in each variable and up to a sign
equals $\xi_{i_1}\wedge\xi_{i_2}\wedge\cdots\wedge\xi_{i_m}$ for some $i_1 < i_2 < \cdots < i_m$. Thus, $K\langle\xi_1, \xi_2, \dots, \xi_n\rangle$ is
a free $K$-module with a basis formed by the monomials

\[ \xi_I \stackrel{\text{def}}{=} \xi_{i_1}\wedge\xi_{i_2}\wedge\cdots\wedge\xi_{i_m} \tag{9.12} \]

numbered by all collections $I = (i_1, i_2, \dots, i_m)$, where $1\le i_1 < i_2 < \cdots < i_m\le n$.
Every permutation of variables multiplies such a monomial by the sign of the
permutation:

\[ \forall\, g\in S_m\quad \xi_{i_{g(1)}}\wedge\xi_{i_{g(2)}}\wedge\cdots\wedge\xi_{i_{g(m)}} = \operatorname{sgn}(g)\cdot\xi_{i_1}\wedge\xi_{i_2}\wedge\cdots\wedge\xi_{i_m}. \tag{9.13} \]

Grassmannian products of basic monomials (9.12) are expressed through those
monomials as

\[ \xi_I\wedge\xi_J = \begin{cases}
\operatorname{sgn}(i_1, i_2, \dots, i_m, j_1, j_2, \dots, j_k)\cdot\xi_{I\sqcup J} & \text{for } I\cap J = \varnothing,\\
0 & \text{for } I\cap J\ne\varnothing,
\end{cases} \tag{9.14} \]

where $\operatorname{sgn}(i_1, i_2, \dots, i_m, j_1, j_2, \dots, j_k) = (-1)^{\frac{1}{2}m(m+1) + \sum_\nu i_\nu}$ by Exercise 9.2.
We write $\Lambda^mK^n\subset K\langle\xi_1, \xi_2, \dots, \xi_n\rangle$ for the $K$-submodule of homogeneous
Grassmannian polynomials of degree $m$. It is free of rank $\binom{n}{m}$. The entire ring of
Grassmannian polynomials splits into the finite direct sum

\[ K\langle\xi_1, \xi_2, \dots, \xi_n\rangle = \bigoplus_{m=0}^{n} \Lambda^mK^n, \tag{9.15} \]

where $\Lambda^kK^n\wedge\Lambda^mK^n\subset\Lambda^{k+m}K^n$. Therefore, in contrast with the usual polynomials,
the ring $K\langle\xi_1, \xi_2, \dots, \xi_n\rangle$ is a free $K$-module of finite rank $2^n$. Note that the
components $\Lambda^0K^n$ and $\Lambda^nK^n$ of minimum and maximum degree in the
decomposition (9.15) both have rank 1. The first of them is generated by the zero-degree
monomial $\xi_\varnothing = 1$, which is the unit of $K\langle\xi_1, \xi_2, \dots, \xi_n\rangle$. The second is generated
by the highest-degree monomial $\xi_{(1, 2, \dots, n)} = \xi_1\wedge\xi_2\wedge\cdots\wedge\xi_n$, which is annihilated
under Grassmannian multiplication by every monomial of positive degree.


Exercise 9.9 Check that Grassmannian monomials commute by the rule

\[ (\xi_{i_1}\wedge\xi_{i_2}\wedge\cdots\wedge\xi_{i_k})\wedge(\xi_{j_1}\wedge\xi_{j_2}\wedge\cdots\wedge\xi_{j_m}) = (-1)^{km}\,(\xi_{j_1}\wedge\xi_{j_2}\wedge\cdots\wedge\xi_{j_m})\wedge(\xi_{i_1}\wedge\xi_{i_2}\wedge\cdots\wedge\xi_{i_k}). \]

The exercise implies that multiplication of homogeneous Grassmannian polynomials
$f$, $g$ satisfies the Koszul sign rule

\[ f\wedge g = (-1)^{\deg f\,\deg g}\, g\wedge f. \tag{9.16} \]

In particular, each homogeneous Grassmannian polynomial of even degree lies in
the center of the Grassmannian polynomial ring

\[ Z(K\langle\xi_1, \xi_2, \dots, \xi_n\rangle) \stackrel{\text{def}}{=} \{f \mid \forall\, g\quad f\wedge g = g\wedge f\}. \]

Exercise 9.10 Find a basis of $Z(K\langle\xi_1, \xi_2, \dots, \xi_n\rangle)$ over $K$.
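The multiplication rule (9.14) for basic monomials can be modelled with index tuples. In the sketch below a monomial $\xi_I$ is represented by the increasing tuple $I$, and the sign is obtained by counting the inversions of the concatenated word $(i_1, \dots, i_m, j_1, \dots, j_k)$. The function name `wedge` and this representation are choices made for the illustration.

```python
from itertools import combinations

def wedge(I, J):
    """Return (sign, K) with xi_I ^ xi_J = sign * xi_K; sign 0 if I and J overlap."""
    if set(I) & set(J):
        return 0, ()
    word = tuple(I) + tuple(J)
    inversions = sum(1 for a in range(len(word))
                     for b in range(a + 1, len(word)) if word[a] > word[b])
    return (-1) ** inversions, tuple(sorted(word))

print(wedge((1, 3), (2,)))  # (-1, (1, 2, 3))
print(wedge((2,), (2, 3)))  # (0, ())

# Koszul sign rule (9.16) on all pairs of disjoint basic monomials in 4 variables.
subsets = [I for m in range(5) for I in combinations(range(1, 5), m)]
koszul_holds = all(
    wedge(I, J)[0] == (-1) ** (len(I) * len(J)) * wedge(J, I)[0]
    for I in subsets for J in subsets if not set(I) & set(J)
)
print(koszul_holds)  # True
```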


9.4.2 Linear Change of Grassmannian Variables 


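Let the homogeneous linear forms $\eta_1, \eta_2, \dots, \eta_k \in \Lambda^1 K^n$ be expressed linearly
through the homogeneous linear forms $\xi_1, \xi_2, \dots, \xi_n \in \Lambda^1 K^n$ by means of the
transition matrix $C \in \mathrm{Mat}_{n \times k}(K)$, that is,

$$(\eta_1, \eta_2, \dots, \eta_k) = (\xi_1, \xi_2, \dots, \xi_n) \cdot C.$$

Then the degree-$m$ monomials $\eta_J = \eta_{j_1} \wedge \eta_{j_2} \wedge \dots \wedge \eta_{j_m}$ are linearly expressed
through the monomials $\xi_I = \xi_{i_1} \wedge \xi_{i_2} \wedge \dots \wedge \xi_{i_m}$ as follows:

$$\begin{aligned}
\eta_J &= \eta_{j_1} \wedge \eta_{j_2} \wedge \dots \wedge \eta_{j_m}
= \Big(\sum_{i_1} \xi_{i_1} c_{i_1 j_1}\Big) \wedge \Big(\sum_{i_2} \xi_{i_2} c_{i_2 j_2}\Big) \wedge \dots \wedge \Big(\sum_{i_m} \xi_{i_m} c_{i_m j_m}\Big) \\
&= \sum_{1 \le i_1 < i_2 < \dots < i_m \le n} \xi_{i_1} \wedge \xi_{i_2} \wedge \dots \wedge \xi_{i_m} \cdot \sum_{g \in S_m} \operatorname{sgn}(g)\, c_{i_{g(1)} j_1} c_{i_{g(2)} j_2} \cdots c_{i_{g(m)} j_m} \\
&= \sum_I \xi_I \cdot c_{IJ},
\end{aligned}$$

where the latter summation is over all collections $I = (i_1, i_2, \dots, i_m)$ of $m$ strictly
increasing indices and $c_{IJ} \overset{\mathrm{def}}{=} \det C_{IJ}$ denotes the determinant of the $m \times m$ submatrix
$C_{IJ} \subset C$, which consists of those $c_{ij}$ with $i \in I$, $j \in J$. This determinant is called the
$IJ$th minor of order $m$ in $C$. Thus, the $IJ$th element of the transition matrix from $\eta_J$
to $\xi_I$ is equal to the $IJ$th minor of the transition matrix from $\eta$ to $\xi$.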
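The statement that the coefficients of a wedge product of linear forms are minors of the transition matrix can be checked by brute force. The sketch below assumes Grassmannian polynomials are stored as dictionaries from increasing index tuples to coefficients, with 0-based indices; the matrix `C` and all helper names are hypothetical.

```python
from functools import reduce
from itertools import combinations

def sign_and_sort(seq):
    """Sign of the permutation sorting seq (0 if an index repeats) and the sorted tuple."""
    if len(set(seq)) < len(seq):
        return 0, ()
    inv = sum(1 for a in range(len(seq)) for b in range(a + 1, len(seq)) if seq[a] > seq[b])
    return (-1) ** inv, tuple(sorted(seq))

def wedge(f, g):
    """Grassmannian product of polynomials stored as {multi-index: coefficient}."""
    out = {}
    for I, a in f.items():
        for J, b in g.items():
            s, K = sign_and_sort(I + J)
            if s:
                out[K] = out.get(K, 0) + s * a * b
    return {K: c for K, c in out.items() if c != 0}

def det(M):
    """Cofactor expansion along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def minor(C, I, J):
    return det([[C[i][j] for j in J] for i in I])

def eta_monomial(C, J):
    """eta_J = eta_{j1} ^ ... ^ eta_{jm}, where eta_j = sum_i xi_i * c_ij."""
    forms = [{(i,): C[i][j] for i in range(len(C))} for j in J]
    return reduce(wedge, forms)

# A hypothetical 4 x 3 transition matrix: column j holds the coefficients of eta_j.
C = [[1, 2, 0], [3, -1, 4], [0, 5, 2], [2, 2, 1]]
```

For every pair of multi-indices, `eta_monomial(C, J).get(I, 0)` should agree with `minor(C, I, J)`.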
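For a collection of strictly increasing indices

$$J = (j_1, j_2, \dots, j_m), \quad 1 \le j_1 < j_2 < \dots < j_m \le n,$$

let us put $\deg J \overset{\mathrm{def}}{=} m$, $|J| \overset{\mathrm{def}}{=} j_1 + j_2 + \dots + j_m$, and write $\bar J = (\bar j_1, \bar j_2, \dots, \bar j_{n-m}) =
\{1, 2, \dots, n\} \smallsetminus J$ for the complementary collection of increasing indices, which has
$\deg \bar J = n - m$. Associated with every square matrix $A \in \mathrm{Mat}_n(K)$ are $n$ homogeneous
linear forms $\alpha_1, \alpha_2, \dots, \alpha_n \in \Lambda^1 K^n$ such that $A$ is the transition matrix
from them to the basic Grassmannian variables $\xi_1, \xi_2, \dots, \xi_n$, i.e., $(\alpha_1, \alpha_2, \dots, \alpha_n) =
(\xi_1, \xi_2, \dots, \xi_n) \cdot A$ or, in the most expanded form,

$$\alpha_j = \xi_1 \cdot a_{1j} + \xi_2 \cdot a_{2j} + \dots + \xi_n \cdot a_{nj}, \quad 1 \le j \le n. \tag{9.17}$$

For a pair of multi-indices $I$, $J$ of the same length $\deg I = \deg J = m$, consider the
Grassmannian monomials $\alpha_J = \alpha_{j_1} \wedge \alpha_{j_2} \wedge \dots \wedge \alpha_{j_m}$ and $\alpha_{\bar I} = \alpha_{\bar i_1} \wedge \alpha_{\bar i_2} \wedge \dots \wedge \alpha_{\bar i_{n-m}}$
of complementary degrees $m$ and $n - m$. By formula (9.14) on p. 215, their product is

$$\alpha_J \wedge \alpha_{\bar I} =
\begin{cases}
(-1)^{|J| + m(m+1)/2}\, \alpha_1 \wedge \alpha_2 \wedge \dots \wedge \alpha_n & \text{for } I = J, \\
0 & \text{for } I \ne J.
\end{cases} \tag{9.18}$$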
As we have seen in Sect. 9.4.2, the left-hand side is expanded through
$\xi_1, \xi_2, \dots, \xi_n$ as

$$\Big(\sum_M \xi_M a_{MJ}\Big) \wedge \Big(\sum_L \xi_{\bar L}\, a_{\bar L \bar I}\Big) = (-1)^{m(m+1)/2}\, \xi_1 \wedge \xi_2 \wedge \dots \wedge \xi_n \cdot \sum_M (-1)^{|M|} a_{MJ}\, a_{\bar M \bar I},$$

where $M$ runs through all the increasing multi-indices of degree $\deg M = m$. The
right-hand side of (9.18) vanishes for $I \ne J$ and equals

$$(-1)^{m(m+1)/2 + |J|} \det A \cdot \xi_1 \wedge \xi_2 \wedge \dots \wedge \xi_n$$

for $I = J$. Thus, for any two collections $I$, $J$ of $m$ columns in a square matrix $A$, the
following Laplace relations hold:

$$\sum_M (-1)^{|M| + |J|} a_{MJ}\, a_{\bar M \bar I} =
\begin{cases}
\det A & \text{for } I = J, \\
0 & \text{for } I \ne J,
\end{cases} \tag{9.19}$$

where the summation is over all collections $M$ of $m$ rows in the matrix $A$.

For $I = J$, the equality (9.19) expresses $\det A$ through the $m$th-degree minors $a_{MJ}$
situated in $m$ fixed columns of $A$ numbered by $J$ and their complementary minors
$a_{\bar M \bar J}$ of degree $(n - m)$. The latter are equal to the determinants of the matrices obtained
from $A$ by removing all rows and columns containing the minor $a_{MJ}$. The resulting
formula is known as the cofactor expansion for the determinant:

$$\det A = \sum_M (-1)^{|M| + |J|} a_{MJ}\, a_{\bar M \bar J}. \tag{9.20}$$

The quantity $(-1)^{|M| + |J|} a_{\bar M \bar J}$ is called the algebraic complement (or cofactor) of the
minor $a_{MJ}$.
For $I \ne J$, the relation (9.19) up to sign looks like

$$\sum_M (-1)^{|M| + |I|} a_{MJ}\, a_{\bar M \bar I} = 0. \tag{9.21}$$

In other words, if the minors $a_{MJ}$ in (9.20) are multiplied by the cofactors
complementary to the minors $a_{MI}$ situated in the other collection of columns $I \ne J$,
then the resulting sum vanishes.
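The Laplace relations can be verified numerically on a small integer matrix. The sketch below assumes 0-based multi-indices (which shifts $|M| + |J|$ by an even number and so leaves the sign unchanged); the sample matrix `A` and the function names are illustrative, not part of the text.

```python
from itertools import combinations, permutations

def det(M):
    """Determinant as a signed sum over permutations (fine for small matrices)."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n) for b in range(a + 1, n) if p[a] > p[b])
        term = (-1) ** inversions
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def minor(A, rows, cols):
    return det([[A[i][j] for j in cols] for i in rows])

def complement(I, n):
    return tuple(i for i in range(n) if i not in I)

def laplace_sum(A, J, I):
    """Sum over row collections M of (-1)^(|M|+|J|) * a_{MJ} * a_{comp(M), comp(I)}."""
    n, m = len(A), len(J)
    total = 0
    for M in combinations(range(n), m):
        # 0-based index sums differ from |M| + |J| by 2m, so the sign is the same.
        sign = (-1) ** (sum(M) + sum(J))
        total += sign * minor(A, M, J) * minor(A, complement(M, n), complement(I, n))
    return total

A = [[2, -1, 0, 3], [1, 4, 2, 0], [0, 5, -2, 1], [3, 1, 1, 2]]
```

Calling `laplace_sum(A, J, J)` for any column collection `J` should return `det(A)`, while `laplace_sum(A, J, I)` with `I != J` should return 0.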


Exercise 9.11 Prove the transposed version of Laplace's relations

$$\sum_M (-1)^{|I| + |M|} a_{JM}\, a_{\bar I \bar M} =
\begin{cases}
\det A & \text{for } I = J, \\
0 & \text{for } I \ne J.
\end{cases} \tag{9.22}$$

Let us number all increasing multi-indices $I$ of degree $m$ in some order by integers
from 1 to $\binom{n}{m}$. Then we number all complementary increasing multi-indices by the
same integers in such a way that all complementary pairs $I$, $\bar I$ get the same numbers.
Write $A_m$ and $A_m^\vee$ for the $\binom{n}{m} \times \binom{n}{m}$ matrices whose $(IJ)$th elements are equal to
$a_{IJ}$ and $a_{IJ}^\vee \overset{\mathrm{def}}{=} (-1)^{|I| + |J|} a_{\bar J \bar I}$ respectively.⁶ In terms of these matrices, the Laplace
relations (9.19), (9.22) are collected in two matrix equalities

$$A_m^\vee \cdot A_m = \det A \cdot E = A_m \cdot A_m^\vee, \tag{9.23}$$

where $E$ means the identity matrix of size $\binom{n}{m} \times \binom{n}{m}$. Therefore, the matrices $A_m$,
$A_m^\vee$ commute and are "almost inverse" to each other.


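Example 9.3 (Determinant of a Matrix Pencil) Given two square matrices $A, B \in
\mathrm{Mat}_n(K)$, consider two commuting scalar variables $x$, $y$ and form the matrix $x \cdot A + y \cdot B$
with elements $x a_{ij} + y b_{ij} \in K[x, y]$. Its determinant $\det(x \cdot A + y \cdot B)$ is a homogeneous
polynomial of degree $n$ in $x$, $y$. We are going to show that the coefficient of $x^m y^{n-m}$
in this polynomial equals

$$\sum_{I J} (-1)^{|I| + |J|} a_{IJ}\, b_{\bar I \bar J} = \operatorname{tr}\big(A_m \cdot B_m^\vee\big), \tag{9.24}$$

where the left summation runs through all increasing multi-indices $I, J \subset
\{1, 2, \dots, n\}$ of degree $\deg I = \deg J = m$. Consider two collections of
homogeneous linear forms

$$(\alpha_1, \alpha_2, \dots, \alpha_n) = (\xi_1, \xi_2, \dots, \xi_n) \cdot A \quad \text{and} \quad (\beta_1, \beta_2, \dots, \beta_n) = (\xi_1, \xi_2, \dots, \xi_n) \cdot B$$

in the Grassmannian polynomial ring $K[x, y]\langle \xi_1, \xi_2, \dots, \xi_n \rangle$. Then

$$(x\alpha_1 + y\beta_1) \wedge (x\alpha_2 + y\beta_2) \wedge \dots \wedge (x\alpha_n + y\beta_n) = \det(xA + yB) \cdot \xi_1 \wedge \xi_2 \wedge \dots \wedge \xi_n.$$

Multiplying out the left-hand side, we get the monomial $x^m y^{n-m}$ when we choose
the first summand within some $m$ factors, say numbered by $i_1, i_2, \dots, i_m$, and choose
the second summand within the remaining $(n - m)$ factors. Such a choice contributes
the summand

$$\begin{aligned}
&\operatorname{sgn}(i_1, i_2, \dots, i_m, \bar i_1, \bar i_2, \dots, \bar i_{n-m}) \cdot \alpha_{i_1} \wedge \alpha_{i_2} \wedge \dots \wedge \alpha_{i_m} \wedge \beta_{\bar i_1} \wedge \beta_{\bar i_2} \wedge \dots \wedge \beta_{\bar i_{n-m}} \\
&\quad = (-1)^{m(m+1)/2 + |I|}\, \alpha_I \wedge \beta_{\bar I}
= (-1)^{m(m+1)/2 + |I|} \Big(\sum_J \xi_J a_{JI}\Big) \wedge \Big(\sum_M \xi_{\bar M}\, b_{\bar M \bar I}\Big) \\
&\quad = (-1)^{m(m+1)/2 + |I|} \sum_{J, M} a_{JI}\, b_{\bar M \bar I}\; \xi_J \wedge \xi_{\bar M} \\
&\quad = \Big(\sum_J (-1)^{|I| + |J|} a_{JI}\, b_{\bar J \bar I}\Big) \cdot \xi_1 \wedge \xi_2 \wedge \dots \wedge \xi_n
\end{aligned}$$

in the resulting coefficient of $x^m y^{n-m}$. The sum of all these contributions, coming
from all increasing multi-indices $I = (i_1, i_2, \dots, i_m)$, is equal to (9.24).

⁶ Note that we swap the indices $I$, $J$ on the right-hand side of the latter formula.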
Example 9.4 (Principal Minors and Trace) For $x = 1$, $y = t$, $B = E$,
formula (9.24) gives the following expansion:

$$\begin{aligned}
\det(tE + A) &= t^n + \sum_{m=1}^{n} t^{n-m} \cdot \sum_{\# I = m} a_{II} \\
&= t^n + t^{n-1} \cdot \sum_i a_{ii} + t^{n-2} \cdot \sum_{i<j} (a_{ii} a_{jj} - a_{ij} a_{ji}) + \dots + t \cdot \sum_i a_{\bar i \bar i} + \det A,
\end{aligned}$$

where the coefficient of $t^{n-m}$ is equal to the sum of all the degree-$m$ minors in $A$
whose main diagonal sits within the main diagonal of $A$. All such minors are called
principal. For example, the coefficient of $t^{n-1}$ is the trace

$$\operatorname{tr}(A) \overset{\mathrm{def}}{=} \sum_{i=1}^{n} a_{ii}. \tag{9.25}$$

Note that $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$ and $\operatorname{tr}(AB) = \sum_{ij} a_{ij} b_{ji} = \operatorname{tr}(BA)$.
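The expansion of $\det(tE + A)$ through principal minors can be tested at several integer values of $t$. This is a minimal sketch with a hypothetical sample matrix; the function names are assumptions made for illustration.

```python
from itertools import combinations

def det(M):
    """Cofactor expansion along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def principal_minor_sum(A, m):
    """Sum of all degree-m minors whose rows and columns are the same index set."""
    n = len(A)
    return sum(det([[A[i][j] for j in I] for i in I])
               for I in combinations(range(n), m))

def det_tE_plus_A(A, t):
    n = len(A)
    return det([[A[i][j] + (t if i == j else 0) for j in range(n)] for i in range(n)])

def expansion(A, t):
    """Right-hand side: sum over m of t^(n-m) times the sum of principal m-minors."""
    n = len(A)
    return sum(t ** (n - m) * principal_minor_sum(A, m) for m in range(n + 1))

A = [[1, 2, 0], [3, -1, 4], [2, 0, 5]]
```

Here `principal_minor_sum(A, 1)` is the trace and `principal_minor_sum(A, len(A))` is the determinant, so the two extreme coefficients come out as the text describes.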


9.6 Adjunct Matrix 
9.6.1 Row and Column Cofactor Expansions 


For $m = 1$, the Laplace relations (9.23) deal with collections $I = (i)$, $J = (j)$,
which consist of just one index. In this case, the minors $a_{IJ} = a_{ij}$ become the matrix
elements, and therefore, $A_1 = A$ in (9.23). The second matrix $A_1^\vee$ in (9.23) is called
the adjunct matrix of $A$ and is denoted by $A^\vee$. It has the same size as $A$ and is formed
by the cofactors of the transposed elements in $A$:

$$a_{ij}^\vee \overset{\mathrm{def}}{=} (-1)^{i+j} a_{\bar j \bar i}. \tag{9.26}$$

The minor $a_{\bar j \bar i}$ is equal to the determinant of the matrix obtained from $A$ by removing
the $j$th row and $i$th column. It is often denoted by $A_{ji}$. The first Laplace relation (9.20)
for $m = 1$ gives the expansion

$$\det A = \sum_{i=1}^{n} (-1)^{i+j} a_{ij}\, a_{\bar i \bar j} = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} A_{ij}, \tag{9.27}$$

called the cofactor expansion of $\det A$ through the $j$th column. Its transposed
version (9.22),

$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij}\, a_{\bar i \bar j} = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} A_{ij}, \tag{9.28}$$

is called the cofactor expansion of the determinant through the $i$th row. For example,
the cofactor expansion of a $3 \times 3$ determinant through the first column looks like

$$\det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= a_{11} (a_{22} a_{33} - a_{23} a_{32}) - a_{21} (a_{12} a_{33} - a_{13} a_{32}) + a_{31} (a_{12} a_{23} - a_{13} a_{22}).$$

The matrix relations (9.23) for $m = 1$ become

$$A \cdot A^\vee = A^\vee \cdot A = \det(A) \cdot E = \begin{pmatrix} \det(A) & & 0 \\ & \ddots & \\ 0 & & \det(A) \end{pmatrix}. \tag{9.29}$$
9.6.2 Matrix Inversion 


Proposition 9.3 Over a commutative ring $K$ with unit, a matrix $A \in \mathrm{Mat}_n(K)$ is
invertible if and only if $\det A$ is invertible in $K$. In this case, $A^{-1} = A^\vee / \det A$.

Proof If $A$ is invertible, then the matrix equality $A A^{-1} = E$ forces $\det(A) \det(A^{-1}) =
1$ in $K$. Therefore, $\det A$ is invertible in $K$. Conversely, if $\det A$ is invertible, then
formula (9.29) says that $A^{-1} = A^\vee / \det A$. □
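Proposition 9.3 can be illustrated over the ring $\mathbb{Z}/(N)$, where invertibility of the determinant means being coprime to $N$. The sketch below is an assumption-laden illustration: the function names are invented here, and `pow(d, -1, N)` requires Python 3.8 or later.

```python
from math import gcd

def minor_matrix(A, i, j):
    """Matrix obtained from A by removing row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion through the first row."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    """Adjunct matrix: the (i, j) entry is the cofactor of the transposed element a_ji."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor_matrix(A, j, i)) for j in range(n)]
            for i in range(n)]

def inverse_mod(A, N):
    """Inverse of A over Z/(N); exists exactly when det A is a unit modulo N."""
    d = det(A) % N
    if gcd(d, N) != 1:
        raise ValueError("det A is not invertible in Z/(N)")
    d_inv = pow(d, -1, N)  # modular inverse of the determinant (Python 3.8+)
    return [[(d_inv * x) % N for x in row] for row in adjugate(A)]

def mat_mul_mod(X, Y, N):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % N
             for j in range(len(Y[0]))] for i in range(len(X))]
```

For example, `inverse_mod([[3, 3], [2, 5]], 26)` succeeds because the determinant 9 is coprime to 26, whereas a matrix with determinant 2 is rejected even though 2 is nonzero modulo 26.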


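Exercise 9.12 Prove that for every finite-dimensional vector space $V$, the following
properties of a linear endomorphism $F : V \to V$ are equivalent: (a) $\ker F = 0$,
(b) $\operatorname{im} F = V$, (c) $\det F \ne 0$.

Example 9.5 For $2 \times 2$ and $3 \times 3$ matrices of determinant 1, we get

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix},$$

$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}^{-1}
= \begin{pmatrix}
(a_{22} a_{33} - a_{23} a_{32}) & -(a_{12} a_{33} - a_{13} a_{32}) & (a_{12} a_{23} - a_{13} a_{22}) \\
-(a_{21} a_{33} - a_{23} a_{31}) & (a_{11} a_{33} - a_{13} a_{31}) & -(a_{11} a_{23} - a_{13} a_{21}) \\
(a_{21} a_{32} - a_{22} a_{31}) & -(a_{11} a_{32} - a_{12} a_{31}) & (a_{11} a_{22} - a_{12} a_{21})
\end{pmatrix}.$$

For matrices of arbitrary invertible determinant, all matrix elements on the
right-hand sides must be divided by this determinant.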
9.6.3 Cayley–Hamilton Identity


Let us introduce $n^2$ variables $a_{ij}$, $1 \le i, j \le n$, and write $K = \mathbb{Z}[a_{ij}]_{1 \le i, j \le n}$ for the
usual commutative polynomial ring in these variables with integer coefficients. Then
$A = (a_{ij})$ is a matrix from $\mathrm{Mat}_n(K)$. Let $t$ be another variable commuting with all $a_{ij}$.
Consider the matrix $tE - A \in \mathrm{Mat}_n(K[t])$. Its determinant $\chi_A(t) \overset{\mathrm{def}}{=} \det(tE - A)$, which
is a polynomial in $t$ with coefficients in $K$, is called the characteristic polynomial of
the matrix $A$. Recall⁷ that every polynomial $f \in K[t]$ can be evaluated at any matrix
$M \in \mathrm{Mat}_n(K)$ by the substitution⁸ $t \mapsto M$. In particular, we can evaluate $\chi_A(t)$ for
$t = A$.

Theorem 9.2 (Cayley–Hamilton Identity) $\chi_A(A) = 0$ in $\mathrm{Mat}_n(K)$.

Proof Formula (9.29) written in the matrix algebra $\mathrm{Mat}_n(K[t])$ for the matrix $tE - A$
says that

$$\det(tE - A) \cdot E = (tE - A) \cdot (tE - A)^\vee. \tag{9.30}$$

Each $B \in \mathrm{Mat}_n(K[t])$ can be written as $t^m B_m + t^{m-1} B_{m-1} + \dots + t B_1 + B_0$ for some
constant matrices $B_i \in \mathrm{Mat}_n(K)$. Such an expanded version of (9.30) looks like

$$t^n E + s_1(A) \cdot t^{n-1} E + \dots + s_{n-1}(A) \cdot tE + s_n(A) \cdot E = (tE - A) \big(t^m A_m^\vee + \dots + t A_1^\vee + A_0^\vee\big),$$

where $t^m A_m^\vee + \dots + t A_1^\vee + A_0^\vee = (tE - A)^\vee$ and $s_i(A) \in K$ mean the coefficients of
the characteristic polynomial $\chi_A(t)$. Substituting $t = A$, we get the required equality
$\chi_A(A) = 0$. □
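The identity can be checked on concrete integer matrices. The sketch below computes the coefficients of $\det(tE - A)$ from sums of principal minors with alternating signs (the expansion of Example 9.4 applied to $-A$) and evaluates the polynomial at $A$ by Horner's scheme; this is a verification aid under those assumptions, not the proof's method, and all names are illustrative.

```python
from itertools import combinations

def det(M):
    """Cofactor expansion along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly_coeffs(A):
    """Coefficients [c_0, ..., c_n] with det(tE - A) = sum of c_k * t^(n-k)."""
    n = len(A)
    return [(-1) ** m * sum(det([[A[i][j] for j in I] for i in I])
                            for I in combinations(range(n), m))
            for m in range(n + 1)]

def evaluate_at_matrix(coeffs, A):
    """Horner's scheme: result = result * A + c * E for each coefficient c."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

A = [[2, -1, 0], [1, 4, 2], [0, 5, -2]]
chi_at_A = evaluate_at_matrix(char_poly_coeffs(A), A)  # expected to be the zero matrix
```

For the $2 \times 2$ matrix with rows $(1, 2)$ and $(3, 4)$, `char_poly_coeffs` gives the coefficients of $t^2 - 5t - 2$, that is, $t^2 - \operatorname{tr}(A)\, t + \det A$.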
Example 9.6 Each $2 \times 2$ matrix $A$ satisfies the quadratic equation⁹

$$A^2 - \operatorname{tr}(A) \cdot A + \det(A) \cdot E = 0.$$

Each $3 \times 3$ matrix $A$ satisfies the cubic equation

$$A^3 - \operatorname{tr}(A) \cdot A^2 + s_2(A) \cdot A - \det(A) \cdot E = 0,$$

where

$$s_2 \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
= (a_{11} a_{22} - a_{12} a_{21}) + (a_{11} a_{33} - a_{13} a_{31}) + (a_{22} a_{33} - a_{23} a_{32})$$

is the sum of the principal $2 \times 2$ minors.

⁷ See Sect. 8.1.3 on p. 175.
⁸ Under this substitution, the degree-zero monomial $t^0$ evaluates to $M^0 = E$.
⁹ Compare with formula (8.5) on p. 179.

Corollary 9.5 Every matrix $A \in \mathrm{Mat}_n(k)$ is algebraic over $k$ and has degree at
most $n$. The minimal polynomial of $A$ divides the characteristic polynomial $\chi_A(t) =
\det(tE - A)$. □


9.6.4 Cramer’s Rules 


Consider the coordinate module $K^n$ over a commutative ring $K$ with unit and write
vectors $v \in K^n$ as columns. For $n$ vectors $v_1, v_2, \dots, v_n \in K^n$ expressed through the
standard basis as $(v_1, v_2, \dots, v_n) = (e_1, e_2, \dots, e_n) \cdot C$, we put, as usual,

$$\det(v_1, v_2, \dots, v_n) \overset{\mathrm{def}}{=} \det C.$$

Proposition 9.4 (Cramer's Rule I) The vectors $v_1, v_2, \dots, v_n \in K^n$ form a basis
of $K^n$ if and only if $\det(v_1, v_2, \dots, v_n)$ is invertible in $K$. In this case, for every vector
$w = x_1 v_1 + x_2 v_2 + \dots + x_n v_n$, the coordinates $x_i$ can be computed by Cramer's rule¹⁰:

$$x_i = \frac{\det(v_1, \dots, v_{i-1}, w, v_{i+1}, \dots, v_n)}{\det(v_1, v_2, \dots, v_n)}. \tag{9.31}$$

Proof We know from Lemma 8.1 on p. 181 that the vectors $v_1, v_2, \dots, v_n$ form a
basis of $K^n$ if and only if the transition matrix $C$ from those vectors to the standard
basis is invertible.

Exercise 9.13 Convince yourself that the proof of Lemma 8.1 is valid for every free
module $V$ over a commutative ring $K$ with unit.

By Proposition 9.3, the invertibility of $C$ is equivalent to the invertibility of $\det C$.
This proves the first statement. Now assume that $\det(v_1, v_2, \dots, v_n)$ is invertible and
let $w = x_1 v_1 + x_2 v_2 + \dots + x_n v_n$. Then

$$\begin{aligned}
\det(v_1, \dots, v_{i-1}, w, v_{i+1}, \dots, v_n)
&= \det\Big(v_1, \dots, v_{i-1}, \sum_\nu x_\nu v_\nu, v_{i+1}, \dots, v_n\Big) \\
&= \sum_\nu x_\nu \cdot \det(v_1, \dots, v_{i-1}, v_\nu, v_{i+1}, \dots, v_n) \\
&= x_i \cdot \det(v_1, \dots, v_{i-1}, v_i, v_{i+1}, \dots, v_n). \qquad \square
\end{aligned}$$

Corollary 9.6 For every $w \in K^n$ and invertible $A \in \mathrm{Mat}_n(K)$, the linear equation
$Ax = w$ in the unknown vector $x \in K^n$ has a unique solution. The coordinates of
this solution are given by Cramer's rule (9.31), where $v_1, v_2, \dots, v_n \in K^n$ are the
columns of the matrix $A$.
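Cramer's rule (9.31) translates directly into code. The following is a small sketch over the field of rational numbers, assuming integer input and exact arithmetic via `fractions.Fraction`; the function name `cramer_solve` is hypothetical.

```python
from fractions import Fraction

def det(M):
    """Cofactor expansion along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def cramer_solve(A, w):
    """Solve A x = w for a square integer matrix A with nonzero determinant."""
    n = len(A)
    d = det(A)
    if d == 0:
        raise ValueError("det A is not invertible in the field of rationals")
    xs = []
    for i in range(n):
        # Replace the i-th column of A by w and take the determinant.
        A_i = [row[:i] + [w[r]] + row[i + 1:] for r, row in enumerate(A)]
        xs.append(Fraction(det(A_i), d))
    return xs

solution = cramer_solve([[2, 1], [1, 3]], [3, 5])
```

The determinant-based formula is convenient for symbolic statements and small systems; for large numerical systems, elimination methods are normally preferred.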


¹⁰ Compare with formula (6.13) on p. 127.

Proof The system $Ax = w$ describes the coefficients of the linear expression $w =
\sum_i v_i x_i$. □

Proposition 9.5 (Cramer's Rule II) Given a system of $n$ linear homogeneous
equations in $n + 1$ unknowns $(x_0, x_1, \dots, x_n)$,

$$\begin{cases}
a_{10} x_0 + a_{11} x_1 + \dots + a_{1n} x_n = 0, \\
a_{20} x_0 + a_{21} x_1 + \dots + a_{2n} x_n = 0, \\
\qquad \cdots \\
a_{n0} x_0 + a_{n1} x_1 + \dots + a_{nn} x_n = 0,
\end{cases} \tag{9.32}$$

we write $A_j$, where $j = 0, 1, \dots, n$, for the determinant of the $n \times n$ matrix obtained
by removing the $j$th column from the coefficient matrix $A = (a_{ij})$:

$$A_j = \det \begin{pmatrix}
a_{1,0} & \cdots & a_{1,j-1} & a_{1,j+1} & \cdots & a_{1,n} \\
a_{2,0} & \cdots & a_{2,j-1} & a_{2,j+1} & \cdots & a_{2,n} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{n,0} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{n,n}
\end{pmatrix}. \tag{9.33}$$

Then $x_\nu = (-1)^\nu A_\nu$ solves the system.

Proof Let us attach a second copy of the $i$th row to the top of the matrix $A$. We get
an $(n + 1) \times (n + 1)$ matrix with zero determinant. The cofactor expansion of this
determinant through the top row leads to the equality

$$a_{i0} A_0 - a_{i1} A_1 + \dots + (-1)^n a_{in} A_n = 0,$$

which says that $x_\nu = (-1)^\nu A_\nu$ satisfies the $i$th equation of the system. □

Corollary 9.7 Let $K = k$ be a field. Then equations (9.32) are linearly independent
if and only if Cramer's rule II produces a nonzero solution. In this case, all solutions
of the system (9.32) are proportional to the Cramer's-rule solution.

Proof If the rows of $A$ are linearly related, then all $A_j$ are equal to 0. If $\operatorname{rk} A = n$,
then by Exercise 8.9 on p. 187, there should be some $n \times n$ submatrix of rank $n$ in $A$.
Then its determinant $A_j$ is nonzero. This proves the first statement. If it holds, then
the rows of $A$ span an $n$-dimensional subspace $U \subset k^{n+1}$. Hence, $\operatorname{Ann}(U) \subset (k^{n+1})^*$
is one-dimensional. □


Example 9.7 The line of intersection of two distinct planes in $k^3$ is given by the
equations

$$\begin{cases}
a_1 x + a_2 y + a_3 z = 0, \\
b_1 x + b_2 y + b_3 z = 0,
\end{cases}$$

and is spanned by the vector $(a_2 b_3 - a_3 b_2,\ -a_1 b_3 + a_3 b_1,\ a_1 b_2 - a_2 b_1)$.
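Cramer's rule II amounts to taking signed maximal minors of the coefficient matrix. A minimal sketch, assuming the system is given as an $n \times (n+1)$ list of integer rows; the function name `cramer_two` is an illustrative choice.

```python
def det(M):
    """Cofactor expansion along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def cramer_two(A):
    """For an n x (n+1) matrix A, return x with x_j = (-1)^j * det(A without column j)."""
    n = len(A)
    return [(-1) ** j * det([row[:j] + row[j + 1:] for row in A])
            for j in range(n + 1)]

planes = [[1, 2, 3], [4, 5, 6]]   # two planes through the origin in 3-space
direction = cramer_two(planes)    # spans their line of intersection
```

For two equations in three unknowns, the result coincides with the vector of Example 9.7, i.e., with the cross product of the two coefficient rows.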




Problems for Independent Solution to Chap. 9 


In all the problems below, we follow our standard defaults: k means an arbitrary 
field, K means any commutative ring with unit. 


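Problem 9.1 Calculate $\operatorname{sgn}(n, (n-1), \dots, 2, 1)$ and $\det \begin{pmatrix} 0 & & 1 \\ & \iddots & \\ 1 & & 0 \end{pmatrix}$.

Problem 9.2 For $A \in \mathrm{Mat}_n(K)$, $C \in \mathrm{Mat}_m(K)$, and $B \in \mathrm{Mat}_{n \times m}(K)$, show that

$$\det \begin{pmatrix} A & B \\ 0 & C \end{pmatrix} = \det A \cdot \det C.$$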


Problem 9.3 Show that the determinant of a triangular matrix¹¹ equals the product
of its diagonal elements.

Problem 9.4 Let all $u_{ij}$ equal 1 in the square matrix $U = (u_{ij})$. Calculate $\det(U - E)$,
where $E$ is the identity matrix.

Problem 9.5 Two rows of a $3 \times 3$ matrix are filled with integers such that the greatest
common divisor of each row equals 1. Is it always possible to fill the third row to
create an integer matrix of determinant 1?

Problem 9.6 Find the total number of (a) $2 \times 2$ matrices of given determinant over
$\mathbb{F}_p = \mathbb{Z}/(p)$, (b) nondegenerate¹² $n \times n$ matrices over $\mathbb{F}_p$.


Problem 9.7 (Zolotarev's Proof of Quadratic Reciprocity) For $a \in \mathbb{F}_p^*$, a
permutation of the finite set $\mathbb{F}_p$ is given by $x \mapsto ax$. Show that the sign of this
permutation is equal to the Legendre–Jacobi symbol¹³ $\left(\frac{a}{p}\right)$. For prime integers
$p, q > 2$, consider two permutations of the finite set $\mathbb{F}_p \times \mathbb{F}_q \simeq \mathbb{Z}/(pq)$ defined by

$$s_p : (x, y) \mapsto (x, x + py) \quad \text{and} \quad s_q : (x, y) \mapsto (qx + y, y).$$

Relate their signs to the values $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$, calculate the sign of the composition
$s_p \circ s_q^{-1}$, and deduce the quadratic reciprocity law¹⁴ from these observations.

Problem 9.8 Find the sum of the determinants of all distinct n x n matrices whose 
entries are the numbers 1, 2, 3, ... , n?, each appearing exactly once. 


Problem 9.9 Consider a 3-diagonal square matrix $A$ such that all elements on the
main diagonal and on the next diagonal above equal 1, whereas all elements on
the diagonal below the main diagonal equal $-1$ (all the other elements are zero).
Show that $\det(A)$ is among the Fibonacci numbers.¹⁵

¹¹ See Example 8.17 on p. 197.
¹² See Definition 9.3 on p. 210.
¹³ See Sect. 3.6.3 on p. 65.
¹⁴ See formula (3.23) on p. 66.

Problem 9.10 Given a function f : Nx N — K, write (f(i,j)) © Mat,(K) 
for the square matrix whose (i,j) th element equals f(i,j). For any two col- 
lections of real numbers a, 02,...,Qn, 61, B2,..., Bn compute (a) det(a;;), 
(b) det(cos(a; — B;)), (e) det(o’'), (d*) det(a! med”), 

Problem 9.11 Show that for a set $X$, the functions $f_1, f_2, \dots, f_n : X \to k$ are linearly
independent in $k^X$ if and only if there exist points $x_1, x_2, \dots, x_n \in X$ such that
$\det(f_i(x_j)) \ne 0$.

Problem 9.12 (Bordered Minors Theorem) Let the matrix $A \in \mathrm{Mat}_{m \times n}(k)$
contain a nondegenerate square $k \times k$ submatrix $B$ such that all $(k + 1) \times (k + 1)$
submatrices of $A$ containing $B$ are degenerate. Show that $\operatorname{rk} A = k$.

Problem 9.13 (Kirchhoff's Matrix Tree Theorem) For a connected graph $\Gamma$ with
$n$ vertices numbered $1, 2, \dots, n$, consider the square matrix $A = A(\Gamma) = (a_{ij})$
such that each diagonal element $a_{ii}$ equals the number of edges going out of the
$i$th vertex, and each nondiagonal element $a_{ij}$ equals $-1$ if the $i$th and $j$th vertices
are joined by some edge, and 0 otherwise. Prove that $\det A = 0$ but each principal
cofactor $A_{ii}$ is equal to the number of spanning trees¹⁶ of the graph $\Gamma$. To begin
with, assume that $\Gamma$ itself is a tree. Then show that $\Gamma$ is a tree if and only if all $A_{ii}$
are equal to 1. Then attack the general case.

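Problem 9.14 For a square matrix $A \in \mathrm{Mat}_n(k)$, write

$$L_A,\ R_A,\ \mathrm{Ad}_A : \mathrm{Mat}_n(k) \to \mathrm{Mat}_n(k)$$

for the linear endomorphisms sending the matrix $X \in \mathrm{Mat}_n(k)$ to the matrices

$$L_A(X) \overset{\mathrm{def}}{=} A \cdot X, \quad R_A(X) \overset{\mathrm{def}}{=} X \cdot A, \quad \mathrm{Ad}_A(X) \overset{\mathrm{def}}{=} A \cdot X \cdot A^{-1},$$

respectively. Calculate the traces and determinants of these endomorphisms.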


Problem 9.15 Write $S^2 \subset k[x_1, x_2]$ for the subspace of homogeneous quadratic
polynomials in $x = (x_1, x_2)$. Associated with every matrix $A \in \mathrm{Mat}_2(k)$ is the
linear change of variables

$$F_A : k[x_1, x_2] \to k[x_1, x_2], \quad f(x) \mapsto f(x \cdot A).$$

Show that this change of variables sends $S^2$ to itself and calculate the trace and
determinant of the restricted map

$$F_A|_{S^2} : S^2 \to S^2.$$

Problem 9.16 (Characteristic Polynomial) Let a linear endomorphism

$$F : V \to V$$

have matrix $A$ in some basis of $V$. Prove that the characteristic polynomial
$\chi_A(t) = \det(tE - A)$ does not depend on the choice of basis.

¹⁵ See Example 4.4 on p. 81.
¹⁶ Subtrees of $\Gamma$ containing all the vertices of $\Gamma$.


Problem 9.17 Let $A, B \in \mathrm{Mat}_n(K)$ satisfy the relation $AB = E$. Show that the
complementary minors of $A$, $B$ satisfy the relation $a_{IJ} = (-1)^{|I| + |J|} b_{\bar J \bar I}$.

Problem 9.18 Calculate all partial derivatives

$$\frac{\partial^k \det(A)}{\partial a_{i_1 j_1}\, \partial a_{i_2 j_2} \cdots \partial a_{i_k j_k}}.$$

Problem 9.19 (Plücker's Relations) Use the Laplace relations to show that the
six numbers $A_{ij}$, $1 \le i < j \le 4$, are the second-degree minors of some $2 \times 4$
matrix¹⁷ $A \in \mathrm{Mat}_{2 \times 4}(k)$ if and only if $A_{12} A_{34} - A_{13} A_{24} + A_{14} A_{23} = 0$. Is there
some complex $2 \times 4$ matrix whose second-degree minors (written in some random
order) are (a) $\{2, 3, 4, 5, 6, 7\}$? (b) $\{3, 4, 5, 6, 7, 8\}$? If such matrices
exist, write some of them explicitly. If not, explain why.


Problem 9.20 (Antisymmetric Matrices) A square matrix $A$ such that $A^t = -A$ is
called antisymmetric. Show that every antisymmetric matrix of odd dimension is
degenerate. Verify that the determinant of an antisymmetric $2 \times 2$ or $4 \times 4$ matrix
is a perfect square in the ring of polynomials in matrix elements with integer
coefficients.


Problem 9.21 Find all a, b, c, g such that the following matrices are invertible, and 
explicitly calculate the inverse: 


1 1 cos y — sin 10c 1a0 
a“ fae! y(n? o. t) (c)|0b0], @I0b0 
‘ a0l 0cl 


Problem 9.22 (Sylvester’s Determinants) Let k be a field. For two polynomials 


A) = ax" + ax 4 ik 
B(x) = box” a bist ttt Dmx + bm 


in k[x] such that agb)b # 0 and m = n, write Pa C k[x] for the subspace of 
polynomials of degree at most d and consider linear map 


Q: Pr-y B Pr—-1 > Pmtn-1 (f,g)t> Agt Bf. 


(a) Check that $\varphi$ is $k$-linear and write its matrix in the standard monomial bases.
(b) Show that $\varphi$ is bijective if and only if $\mathrm{GCD}(A, B) = 1$.

¹⁷ Meaning that $A_{ij}$ equals the determinant of the submatrix formed by the $i$th and $j$th columns.

(c) For each $\nu = 0, 1, 2, \dots$ write $d_\nu$ for the determinant of the $(m + n - 2\nu) \times
(m + n - 2\nu)$ matrix constructed from the $(m + n) \times (m + n)$ matrix¹⁸


gy... «++ An-1 An 
a... oe» An—| Am 
n 
ao see - An—| An 
Bg ons Dnt Di (9.34) 
m 
bo... Dy) Dn 
bo... Dm—-1 Dn 
———, 
m+n 


by removing $\nu$ sequential exterior bordering perimeters. Show that
$\deg \mathrm{GCD}(A, B)$ coincides with the index of the first nonzero element in the
sequence $d_0, d_1, d_2, \dots$.

(d) (Resultant) In the polynomial ring $\mathbb{Z}[t, a_0, b_0, \alpha_1, \alpha_2, \dots, \alpha_m, \beta_1, \beta_2, \dots, \beta_n]$
put

$$A(t) = a_0 t^m + a_1 t^{m-1} + \dots + a_{m-1} t + a_m \overset{\mathrm{def}}{=} a_0 \prod_{i=1}^{m} (t - \alpha_i),$$

$$B(t) = b_0 t^n + b_1 t^{n-1} + \dots + b_{n-1} t + b_n \overset{\mathrm{def}}{=} b_0 \prod_{j=1}^{n} (t - \beta_j).$$

The quantity $R_{A,B} \overset{\mathrm{def}}{=} a_0^n \prod_i B(\alpha_i) = a_0^n b_0^m \prod_{i,j} (\alpha_i - \beta_j) = (-1)^{mn} b_0^m \prod_j A(\beta_j)$
considered as an element of the ring $\mathbb{Z}[a_0, a_1, \dots, a_m, b_0, b_1, \dots, b_n]$ is called
the resultant of the polynomials $A$, $B$. Show that $R_{A,B}$ equals the determinant of
the matrix¹⁹ (9.34) and can be represented as

$$R_{A,B} = f(t) \cdot A(t) + g(t) \cdot B(t)$$

for appropriate polynomials $f$, $g$ with integer coefficients. For all $A, B \in k[t]$, show that²⁰ $R_{A,B} \in k$ vanishes if
and only if $A$ and $B$ are not coprime in $k[t]$.

¹⁸ All elements outside the strips shown in (9.34) equal zero.
¹⁹ Called Sylvester's determinant.
²⁰ The number $R_{A,B} \in k$ is the result of evaluation of $R_{A,B} \in \mathbb{Z}[a_0, a_1, \dots, a_m, b_0, b_1, \dots, b_n]$ on the
coefficients of given polynomials $A, B \in k[t]$.


Chapter 10 
Euclidean Spaces 


10.1 Inner Product 


10.1.1 Euclidean Structure 


Let $V$ be a vector space over the field of real numbers $\mathbb{R}$. An inner product on $V$ is
a function $V \times V \to \mathbb{R}$ sending a pair of vectors $u, w \in V$ to a number $(u, w) \in \mathbb{R}$
such that $(u, w) = (w, u)$ for all $u, w \in V$, $(v, v) > 0$ for all $v \ne 0$, and

$$(\lambda_1 u_1 + \lambda_2 u_2,\ \mu_1 w_1 + \mu_2 w_2)
= \lambda_1 \mu_1 (u_1, w_1) + \lambda_1 \mu_2 (u_1, w_2) + \lambda_2 \mu_1 (u_2, w_1) + \lambda_2 \mu_2 (u_2, w_2)$$

for all $\lambda_1, \lambda_2, \mu_1, \mu_2 \in \mathbb{R}$ and all $u_1, u_2, w_1, w_2 \in V$. The first condition is called
symmetry, the second, positivity, and the third, bilinearity.¹ A real vector space $V$
equipped with an inner product is called a Euclidean vector space. An inner product
on a Euclidean space is also called a Euclidean structure.


Example 10.1 (Coordinate Space) Let $V$ be the coordinate space $\mathbb{R}^n$. For $u =
(x_1, x_2, \dots, x_n)$, $w = (y_1, y_2, \dots, y_n)$ in $V$ put

$$(u, w) \overset{\mathrm{def}}{=} x_1 y_1 + x_2 y_2 + \dots + x_n y_n. \tag{10.1}$$

This function is obviously bilinear, symmetric, and positive. The inner product
(10.1) is called the standard Euclidean structure on $\mathbb{R}^n$.

¹ Bilinearity means that the inner product is linear in each argument while the other argument is
fixed.
Example 10.2 (Continuous Functions) Let $V$ be the space of continuous real-valued
functions on some segment $[a, b] \subset \mathbb{R}$. The inner product of functions
$f, g \in V$ is defined by

$$(f, g) \overset{\mathrm{def}}{=} \int_a^b f(x)\, g(x)\, dx. \tag{10.2}$$

Exercise 10.1 Check that $(f, g)$ is bilinear and positive.

If functions $[a, b] \to \mathbb{R}$ are thought of as elements of an infinite-dimensional
coordinate vector space $\mathbb{R}^{[a,b]}$, then the inner product (10.2) appears as a generalization
of the standard Euclidean structure (10.1). There are various such generalizations in
several directions. One can consider other classes of integrable functions instead of
the continuous. The only constraint imposed by positivity is

$$f \ne 0 \ \Rightarrow\ \int_a^b f^2(x)\, dx > 0.$$

Conversely, the inner product (10.2) can be restricted to some subspaces in $V$, e.g.,
to the space of polynomials of degree at most $n$. On the other hand, one can change
the notion of integral, e.g., integrate with some weight. Finally, one can consider
other integration domains instead of the segment.


10.1.2 Length of a Vector 


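Associated with a Euclidean structure on $V$ is a length function

$$V \to \mathbb{R}_{\ge 0}, \quad v \mapsto |v| \overset{\mathrm{def}}{=} \sqrt{(v, v)}.$$

Note that $|v| > 0$ for all $v \ne 0$ and $|\lambda v| = |\lambda| \cdot |v|$ for all $\lambda \in \mathbb{R}$, $v \in V$. The inner
product $V \times V \to \mathbb{R}$ is uniquely recovered from the length function as

$$(u, w) = \big(|u + w|^2 - |u|^2 - |w|^2\big)/2 = \big(|u + w|^2 - |u - w|^2\big)/4. \tag{10.3}$$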


10.1.3 Orthogonality 


Vectors $u$, $w$ in a Euclidean space are called orthogonal² if $(u, w) = 0$. For every
pair of orthogonal vectors $a$, $b$, the length of the vector $c = b - a$, which joins the
heads of $a$, $b$ (see Fig. 10.1), satisfies the Pythagorean theorem

$$|c|^2 = (c, c) = (b - a, b - a) = (a, a) + (b, b) = |a|^2 + |b|^2.$$

Fig. 10.1 Right triangle

² Or perpendicular.


A collection of mutually orthogonal vectors $e_1, e_2, \dots, e_k$ is called an orthogonal
collection. If all vectors in an orthogonal collection have length 1, the collection is
called orthonormal. Associated with every orthonormal basis $e_1, e_2, \dots, e_n$ of $V$ is
an isomorphism

$$V \xrightarrow{\ \sim\ } \mathbb{R}^n, \quad x_1 e_1 + x_2 e_2 + \dots + x_n e_n \mapsto (x_1, x_2, \dots, x_n),$$

which identifies the inner product on $V$ with the standard one on $\mathbb{R}^n$ described in
Example 10.1. Indeed, the orthonormality relations

$$(e_i, e_j) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j, \end{cases} \tag{10.4}$$

force $\big(\sum_i x_i e_i, \sum_j y_j e_j\big) = \sum_{ij} x_i y_j (e_i, e_j) = \sum_i x_i y_i$. For an orthonormal basis, the
$i$th coordinate $x_i = x_i(v)$ of a vector $v = \sum_j x_j e_j$ is equal to the inner product $(v, e_i)$,
because

$$(v, e_i) = \Big(\sum_j x_j e_j,\ e_i\Big) = \sum_j x_j (e_j, e_i) = x_i.$$

The length of a vector can be computed by the generalized Pythagorean theorem:

$$|v|^2 = (v, v) = \sum_i x_i^2.$$
Fig. 10.2 Second step of the Gram–Schmidt orthogonalization procedure


Proposition 10.1 In the linear span of a collection $u = (u_1, u_2, \dots, u_m)$ of vectors
$u_i \in V$, there exists an orthonormal basis $e = (e_1, e_2, \dots, e_k)$ such that the
transition matrix³ $C_{eu} = (c_{ij})$ is upper triangular, i.e., has $c_{ij} = 0$ for all $i > j$.

Proof Put $e_1 = u_1/|u_1|$. Then $|e_1| = 1$, and $e_1$ spans the same subspace as $u_1$. Now
assume by induction that the linear span of $u_1, u_2, \dots, u_i$ possesses the required
orthonormal basis $e_1, e_2, \dots, e_\ell$. Then we put (see Fig. 10.2)

$$w_{\ell+1} = u_{i+1} - \sum_{\nu=1}^{\ell} (u_{i+1}, e_\nu) \cdot e_\nu.$$

If $w_{\ell+1} = 0$, then the vector $u_{i+1}$ lies in the linear span of $e_1, e_2, \dots, e_\ell$, and we go to
the next step with the same $e_1, e_2, \dots, e_\ell$. If $w_{\ell+1} \ne 0$, then $w_{\ell+1}$ does not lie in the
linear span of $e_1, e_2, \dots, e_\ell$, because $(w_{\ell+1}, e_j) = (u_{i+1}, e_j) - \sum_\nu (u_{i+1}, e_\nu)(e_\nu, e_j) =
(u_{i+1}, e_j) - (u_{i+1}, e_j)(e_j, e_j) = 0$ for all $1 \le j \le \ell$, whereas every linear combination
$w = \sum_\nu \lambda_\nu e_\nu$ with some $\lambda_j \ne 0$ has $(w, e_j) = \sum_\nu \lambda_\nu (e_\nu, e_j) = \lambda_j \ne 0$. Therefore,
the vectors $e_1, e_2, \dots, e_\ell, w_{\ell+1}$ are linearly independent and have the same linear
span as the vectors $u_1, u_2, \dots, u_{i+1}$. We put $e_{\ell+1} = w_{\ell+1}/|w_{\ell+1}|$ and go to the next
step. □

Remark 10.1 The process that constructs $e_1, e_2, \dots, e_k$ from $u_1, u_2, \dots, u_m$ as
described above is called the Gram–Schmidt orthogonalization procedure.
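The inductive step of the proof can be turned into a short procedure. The sketch below works in $\mathbb{R}^n$ with the standard inner product and floating-point arithmetic; the tolerance `tol` used to decide whether $w_{\ell+1}$ vanishes is an assumption of this illustration, since exact zero tests are unreliable with floats.

```python
import math

def inner(u, w):
    """Standard Euclidean inner product on R^n."""
    return sum(x * y for x, y in zip(u, w))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormal basis of the linear span of `vectors`, built one vector at a time."""
    basis = []
    for u in vectors:
        # Subtract from u its components along the already constructed e_1, ..., e_l.
        w = list(u)
        for e in basis:
            c = inner(u, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:  # w == 0 (numerically) means u adds nothing to the span
            basis.append([wi / norm for wi in w])
    return basis

e = gram_schmidt([[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0]])
```

In this example the second input vector is proportional to the first, so it is skipped and the resulting basis has two vectors.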


Corollary 10.1 Every finite-dimensional Euclidean space has an orthogonal basis.
□

³ Recall that the $j$th column of $C_{eu}$ is formed from the coefficients of the linear expansion of the
vector $u_j$ in terms of the vectors of $e$.
10.2 Gramians


10.2.1 Gram Matrices


In Euclidean space, associated with any two finite sequences of vectors

$$u = (u_1, u_2, \dots, u_m) \quad \text{and} \quad w = (w_1, w_2, \dots, w_n) \tag{10.5}$$

is the $m \times n$ matrix of reciprocal inner products between the vectors

$$G_{uw} \overset{\mathrm{def}}{=} \big((u_i, w_j)\big), \tag{10.6}$$

called the Gram matrix or just Gramian of collections $u$, $w$. For $w = u$, the Gramian
becomes the inner product table $((u_i, u_j))$ of vectors $u = (u_1, u_2, \dots, u_m)$. We write
$G_u$ instead of $G_{uu}$ in this case. The Gram matrix $G_u$ is symmetric, i.e., $G_u^t = G_u$.
Its determinant $\Gamma_u \overset{\mathrm{def}}{=} \det G_u$ is called the Gram determinant of the vectors $u$. The
vectors $e = (e_1, e_2, \dots, e_m)$ are orthonormal if and only if $G_e = E$. In this case,
$\Gamma_e = 1$.

Matrix computations with Gramians are simplified if we allow the multiplication
of matrices whose elements are vectors by treating the product of two vectors as
an inner product: $uv \overset{\mathrm{def}}{=} (u, v) \in \mathbb{R}$. Then the Gramian $G_{uw}$ of two collections of
vectors (10.5) becomes the product of a column matrix $u^t$ and a row matrix $w$:

$$G_{uw} = u^t w. \tag{10.7}$$

Let $u = e\, C_{eu}$, $w = f\, C_{fw}$ be linearly expressed in terms of some other collections of
vectors

$$e = (e_1, e_2, \dots, e_r) \quad \text{and} \quad f = (f_1, f_2, \dots, f_s).$$

Substitution of these expressions in (10.7) leads to $G_{uw} = u^t w = (e\, C_{eu})^t f\, C_{fw} =
C_{eu}^t\, e^t f\, C_{fw} = C_{eu}^t\, G_{ef}\, C_{fw}$. We conclude that Gramians $G_{uw}$ and $G_{ef}$ are related as

$$G_{uw} = C_{eu}^t\, G_{ef}\, C_{fw}. \tag{10.8}$$

Lemma 10.1 Let $e = (e_1, e_2, \dots, e_n)$ be an orthonormal basis of a Euclidean
space $V$. Then $\Gamma_u = \det^2 C_{eu}$ for every collection of $n$ vectors $u = (u_1, u_2, \dots, u_n) =
e \cdot C_{eu}$.

Proof Since $G_u = C_{eu}^t G_e C_{eu} = C_{eu}^t E\, C_{eu} = C_{eu}^t C_{eu}$ and $\det C_{eu} = \det C_{eu}^t$, we get
$\Gamma_u = \det G_u = \det^2 C_{eu}$. □
Proposition 10.2 For every collection of vectors $u = (u_1, u_2, \dots, u_m)$, the inequality
$\Gamma_u = \det\big((u_i, u_j)\big) \ge 0$ holds. It is an equality if and only if the vectors
$u_1, u_2, \dots, u_m$ are linearly related.

Proof Let $e = (e_1, e_2, \dots, e_k)$ be an orthonormal basis in the linear span of vectors
$u_1, u_2, \dots, u_m$. Then $G_u = C_{eu}^t C_{eu}$. If $u$ is linearly independent, then $u$ is also a
basis, $k = m$, $\det C_{eu} \ne 0$, and $\Gamma_u = \det^2 C_{eu} > 0$. If $u$ is linearly related, then
$k < m$ and $\operatorname{rk} G_u = \operatorname{rk}(C_{eu}^t C_{eu}) \le k < m$. Hence $\det G_u = 0$. □


10.2.2 Euclidean Volume


Let us fix some orthonormal basis $e = (e_1, e_2, \dots, e_n)$ in a Euclidean space $V$
and write $\omega$ for the volume form such that $\omega(e_1, e_2, \dots, e_n) = 1$. Then for every
collection of vectors $v = (v_1, v_2, \dots, v_n)$, we get a remarkable equality:

$$\omega^2(v_1, v_2, \dots, v_n) = \Gamma_v. \tag{10.9}$$

Indeed, $\omega(v_1, v_2, \dots, v_n) = \det C_{ev}$, whereas $\Gamma_v = \det^2 C_{ev}$ by Lemma 10.1.
That is, the Gram determinant of every collection of $n$ vectors is the squared volume
of the parallelepiped spanned by the vectors, if the volume form is normalized in
such a way that the volume of some orthonormal basis is 1. Hence, the absolute
value of the volume does not depend on the choice of orthonormal basis used to
normalize the volume form. This absolute value is denoted by

$$\operatorname{Vol}(v_1, v_2, \dots, v_n) = |\omega(v_1, v_2, \dots, v_n)| = \sqrt{\Gamma_v} = \sqrt{\det\big((v_i, v_j)\big)} \tag{10.10}$$

and called the Euclidean volume on $V$. As a byproduct, we get the following
corollary.

Corollary 10.2 All orthonormal bases of a finite-dimensional Euclidean space $V$
have the same absolute value of volume with respect to any volume form on $V$. Their
Euclidean volume equals 1. □
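Gram determinants are simple to compute for vectors with integer coordinates, which makes the statements of this section easy to test. The sketch below uses the standard inner product on $\mathbb{R}^n$; the function names are illustrative assumptions.

```python
def inner(u, w):
    """Standard Euclidean inner product."""
    return sum(x * y for x, y in zip(u, w))

def det(M):
    """Cofactor expansion along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def gram_matrix(vectors):
    """Table of inner products (u_i, u_j)."""
    return [[inner(a, b) for b in vectors] for a in vectors]

def gram_det(vectors):
    return det(gram_matrix(vectors))

# Squared area of the parallelogram spanned by two vectors in 3-space.
area_squared = gram_det([[1, 2, 3], [4, 5, 6]])
```

For $n$ vectors in $\mathbb{R}^n$ the value of `gram_det` equals the square of the determinant of their coordinate matrix, and it is zero for linearly related vectors.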


10.2.3 Orientation 


We say that two orthonormal bases in a Euclidean space V are cooriented if they 
have the same volume with respect to some nonzero volume form on V. Otherwise, 
the bases are called contraoriented. The latter means that their volumes have equal 
absolute values but opposite signs. It follows from Corollary 10.2 that every two 
orthonormal bases are either cooriented or contraoriented, and this does not depend 
on the choice of volume form. Note that the odd permutations of orthonormal basis 
vectors as well as multiplication of some odd number of vectors by —1 changes the 
orientation of the basis. Even permutations of basis vectors preserve the orientation 
of the basis. 
10.2.4 Cauchy–Bunyakovsky–Schwarz Inequality


For a collection of two vectors $v, w \in V$, the inequality

$$\det \begin{pmatrix} (v, v) & (v, w) \\ (w, v) & (w, w) \end{pmatrix} \ge 0$$

from Proposition 10.2 can be written as

$$(v, v) \cdot (w, w) \ge (v, w)^2 \tag{10.11}$$

and is known as the Cauchy–Bunyakovsky–Schwarz inequality. By Proposition 10.2,
it becomes an equality if and only if the vectors $v$, $w$ are proportional.

For the standard Euclidean structure on $\mathbb{R}^n$ from Example 10.1, the
Cauchy–Bunyakovsky–Schwarz inequality (10.11) says that for every two collections of real
numbers $x_1, x_2, \dots, x_n$ and $y_1, y_2, \dots, y_n$, the inequality

$$(x_1^2 + x_2^2 + \dots + x_n^2) \cdot (y_1^2 + y_2^2 + \dots + y_n^2) \ge (x_1 y_1 + x_2 y_2 + \dots + x_n y_n)^2 \tag{10.12}$$

holds, and it is an equality if and only if the collections are proportional.
For continuous functions $f, g : [a, b] \to \mathbb{R}$ from Example 10.2 we get the integral
inequality

$$\Big(\int_a^b f^2(x)\, dx\Big) \cdot \Big(\int_a^b g^2(x)\, dx\Big) \ge \Big(\int_a^b f(x)\, g(x)\, dx\Big)^2,$$

which is an equality if and only if $f = \lambda g$ for some constant $\lambda \in \mathbb{R}$.


10.3 Duality 


10.3.1 Isomorphism V = V* Provided by Euclidean Structure 


A Euclidean structure $V \times V \to \mathbb{R}$ can be viewed as a perfect pairing
between $V$ and $V$ in the sense of Sect. 7.1.4 on p. 160. In particular, it produces a
linear map

$$g : V \to V^*, \quad u \mapsto (\,\ast\,, u), \tag{10.13}$$

sending a vector $u$ to the linear form $g_u : w \mapsto (w, u)$ on $V$. This map is
injective, because for $v \neq 0$, the covector $g_v \in V^*$ takes a nonzero value
$g_v(v) = (v, v) > 0$ on the vector $v \in V$. For finite-dimensional $V$, the
injectivity of the linear map (10.13) is equivalent to bijectivity.


Exercise 10.2 Check that the matrix of the linear map (10.13) written in any pair
of dual bases $e$, $e^*$ of $V$ and $V^*$ coincides with the Gram matrix $G_e$.

Therefore, every linear form $V \to \mathbb{R}$ on a Euclidean space can be written as an
inner product with an appropriate vector uniquely determined by the form. In particular,
the coordinate forms $v_1^*, v_2^*, \ldots, v_n^* \in V^*$ of any⁴ basis
$v = (v_1, v_2, \ldots, v_n)$ in $V$ can be written as inner products with some vectors
$v_1^\vee, v_2^\vee, \ldots, v_n^\vee \in V$ uniquely determined from the relations

$$(v_i^\vee, v_j) = (v_j, v_i^\vee) = \begin{cases} 0 & \text{for } i \neq j, \\ 1 & \text{for } i = j. \end{cases} \tag{10.14}$$

In the matrix notation introduced in formula (10.7) on p. 233, these relations are
written as $v^t \cdot v^\vee = E$, where $v^t$ and $v^\vee$ mean $(v_1, v_2, \ldots, v_n)$
written as a column and $(v_1^\vee, v_2^\vee, \ldots, v_n^\vee)$ written as a row.
Therefore, the transition matrix $C_{v v^\vee}$, which is formed by the coordinate columns
of the vectors $v^\vee$ in the basis $v$, satisfies the relation
$E = v^t \cdot v^\vee = v^t \cdot v \cdot C_{v v^\vee} = G_v \cdot C_{v v^\vee}$. Thus,
$C_{v v^\vee} = G_v^{-1}$, i.e.,

$$v^\vee = v \cdot G_v^{-1}. \tag{10.15}$$


Exercise 10.3 Check that $v_i^{\vee\vee} = v_i$.


The bases $v_1^\vee, v_2^\vee, \ldots, v_n^\vee$ and $v_1, v_2, \ldots, v_n$ in $V$ are
said to be Euclidean dual to each other. Every orthonormal basis is Euclidean dual to
itself. For an orthogonal basis $u_1, u_2, \ldots, u_n$, the dual basis vectors are
$u_i^\vee = u_i / |u_i|^2$. Since the coordinates of every vector $w \in V$ in an
arbitrary basis $v_1, v_2, \ldots, v_n$ are equal to the inner products with the
Euclidean dual basis vectors, we get the equality

$$w = \sum_i (w, v_i^\vee) \cdot v_i. \tag{10.16}$$


Exercise 10.4 Check this assertion by taking the inner product of both sides
with $v_j^\vee$.
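
Formula (10.15) can be tried out on a small example. The sketch below is an illustration only: the function names `inverse` and `dual_basis` are hypothetical, vectors are stored as coordinate lists in the standard Euclidean structure on $\mathbb{R}^n$, and the Gram matrix is inverted by plain Gauss–Jordan elimination over the rationals.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def inverse(m):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(m)
    # Augment m with the identity matrix on the right.
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        lead = a[col][col]
        a[col] = [x / lead for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def dual_basis(basis):
    """Euclidean dual basis computed as v^vee = v * G_v^{-1}, formula (10.15)."""
    n, dim = len(basis), len(basis[0])
    g_inv = inverse([[inner(u, w) for w in basis] for u in basis])
    # The j-th dual vector is the combination of basis vectors with the
    # coefficients taken from the j-th column of G_v^{-1}.
    return [[sum(basis[i][c] * g_inv[i][j] for i in range(n)) for c in range(dim)]
            for j in range(n)]

basis = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(2)]]
dual = dual_basis(basis)
print(dual)
```

For the basis $v_1 = (1, 1)$, $v_2 = (0, 2)$ this produces $v_1^\vee = (1, 0)$ and $v_2^\vee = (-1/2, 1/2)$, which satisfy relations (10.14).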


10.3.2 Orthogonal Complement and Orthogonal Projection


Let $V$ be a Euclidean vector space and $U \subset V$ a subspace. The subspace
$U^\perp = g^{-1}(\operatorname{Ann} U) = \{w \in V \mid \forall\, u \in U \ (u, w) = 0\}$
is called the orthogonal⁵ of $U$. If $\dim V = n < \infty$, then Theorem 7.2 on p. 162
implies that for every integer $k$ in the range $0 \leq k \leq n$, the correspondence
$U \leftrightarrow U^\perp$ establishes a bijection between $k$-dimensional and
$k$-codimensional subspaces in $V$. This bijection reverses the inclusions and possesses
the following properties: $U^{\perp\perp} = U$, $(U \cap W)^\perp = U^\perp + W^\perp$,
$(U + W)^\perp = U^\perp \cap W^\perp$.


⁴ Not necessarily orthogonal or orthonormal.

⁵ Or orthogonal complement to $U$.


Theorem 10.1 (Orthogonal Decomposition) Let $U$ be a proper nonzero
finite-dimensional subspace of an arbitrary⁶ Euclidean space $V$. Then

$$V = U \oplus U^\perp,$$

and for every vector $v \in V$, there exists a unique vector $v_U \in U$ possessing the
following equivalent properties:


(1) $v_U = \pi_U v$, where $\pi_U : V \twoheadrightarrow U$ is the projection along $U^\perp$.
(2) For every pair of Euclidean dual bases $u_1, u_2, \ldots, u_k$ and
    $u_1^\vee, u_2^\vee, \ldots, u_k^\vee$ in $U$,

$$v_U = \sum_{i=1}^{k} (v, u_i^\vee) \cdot u_i. \tag{10.17}$$

(3) $(v, u) = (v_U, u)$ for all $u \in U$.
(4) $v - v_U \in U^\perp$.
(5) For all $u \neq v_U$ in $U$, the strict inequality $|v - v_U| < |v - u|$ holds.⁷


Proof Let us verify first that properties (2), (3), (4) are equivalent. The equivalence
(3) $\Longleftrightarrow$ (4) is obvious, because for all $u \in U$, the equalities
$(v, u) = (v_U, u)$ and $(v - v_U, u) = 0$ mean the same thing. Since both sides of the
equality $(v, u) = (v_U, u)$ in (3) are linear in $u$, this equality holds for all
$u \in U$ if and only if it holds for every basis vector of some basis in $U$. If
$v_U \in U$ satisfies (2) for some dual bases of $U$, then (3) holds for every basis
vector $u_j^\vee$:

$$(v_U, u_j^\vee) = \Bigl(\sum_i (v, u_i^\vee) \cdot u_i,\ u_j^\vee\Bigr) = \sum_i (v, u_i^\vee) \cdot (u_i, u_j^\vee) = (v, u_j^\vee).$$

Thus (2) $\Rightarrow$ (3). Conversely, if a vector $v_U \in U$ satisfies (3), then its
expression in terms of every basis $u_1, u_2, \ldots, u_k$ in $U$ by formula (10.16) is
(10.17). Therefore, (3) $\Rightarrow$ (2).

Now let us choose some basis in $U$ and define the vector $v_U$ by formula (10.17).
Then $v_U$ satisfies (3) and (4) as well. If we perform this for all $v \in V$, we may
write every $v \in V$ as $v = v_U + (v - v_U)$, where $v_U \in U$ and
$v - v_U \in U^\perp$. Hence, $V = U + U^\perp$. Certainly $U \cap U^\perp = 0$, because
every $w \in U \cap U^\perp$ is forced to have $(w, w) = 0$. Therefore,
$V = U \oplus U^\perp$ and $v_U = \pi_U v$. This proves the first statement of the
theorem as well as the uniqueness of $v_U$ and the equivalence between (1) and (2), (3),
(4). It remains to check that for every nonzero vector $w \in U$, the strict inequality
$|v - v_U| < |v - (v_U + w)|$ holds. Since both sides are nonnegative, it is enough to
verify the same inequality between their squares. Since $v - v_U \in U^\perp$, we have

$$|v - (v_U + w)|^2 = \bigl((v - v_U) - w,\ (v - v_U) - w\bigr) = (v - v_U, v - v_U) + (w, w) = |v - v_U|^2 + |w|^2 > |v - v_U|^2,$$

as required. □


⁶ Not necessarily finite-dimensional.

⁷ In other words, the minimal distance between the point $v \in \mathbb{A}(V)$ and the
points of the affine space $\mathbb{A}(U)$ is achieved at a unique point of
$\mathbb{A}(U)$, and this point is $v_U$.


Definition 10.1 (Orthogonal Projection) Under the conditions of Sect. 10.3.2, the
linear projection $\pi_U : V \twoheadrightarrow U$ along $U^\perp$ is called the
orthogonal projection of $V$ onto $U$.
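
As a rough computational illustration of Theorem 10.1, the orthogonal projection can be evaluated in the standard Euclidean structure on $\mathbb{R}^n$. The sketch below does not invert a Gram matrix; as a simplifying assumption it first orthogonalizes the given basis of $U$ by the Gram–Schmidt process and then applies formula (10.17) with the dual vectors $e^\vee = e/|e|^2$ of an orthogonal basis. The name `project` is ours.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def project(v, spanning):
    """Orthogonal projection of v onto the span of linearly independent vectors."""
    orth = []
    for u in spanning:
        # Gram-Schmidt step: subtract from u its components along earlier vectors.
        for e in orth:
            c = inner(u, e) / inner(e, e)
            u = [a - c * b for a, b in zip(u, e)]
        orth.append(u)
    # Formula (10.17) for an orthogonal basis, whose dual vectors are e / |e|^2.
    result = [Fraction(0)] * len(v)
    for e in orth:
        c = inner(v, e) / inner(e, e)
        result = [r + c * b for r, b in zip(result, e)]
    return result

v = [Fraction(1), Fraction(2), Fraction(3)]
U = [[Fraction(1), Fraction(0), Fraction(0)],
     [Fraction(1), Fraction(1), Fraction(0)]]
v_U = project(v, U)
h = [a - b for a, b in zip(v, v_U)]   # the component of v lying in the orthogonal of U
print(v_U, h)
```

Here $v_U = (1, 2, 0)$ and $h = v - v_U = (0, 0, 3)$ is orthogonal to both spanning vectors, in agreement with property (4).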


Example 10.3 (Euclidean Volume of a Parallelepiped Revisited) Let $V$ be a
Euclidean vector space of dimension $\dim V > n$. Consider a collection of $n + 1$
vectors $v, u_1, u_2, \ldots, u_n \in V$ and write $U$ for the linear span of the last
$n$ vectors $u_1, u_2, \ldots, u_n$. By Theorem 10.1, we can write the first vector $v$ as
$v = \pi_U v + h$, where $\pi_U v \in U$ is the orthogonal projection of $v$ on $U$ and
$h \in U^\perp$. Note that geometrically, $|h|$ is the distance between the opposite
$n$-dimensional faces parallel to $u_1, u_2, \ldots, u_n$ in the $(n + 1)$-dimensional
parallelepiped spanned by $v, u_1, u_2, \ldots, u_n$. Since $\pi_U v$ is a linear
combination of vectors $u_i$, the Euclidean volume of the latter parallelepiped equals

$$\operatorname{Vol}(v, u_1, u_2, \ldots, u_n) = |\det(v, u_1, u_2, \ldots, u_n)| = |\det(h, u_1, u_2, \ldots, u_n)| = \sqrt{\Gamma_{h, u_1, u_2, \ldots, u_n}}\,.$$

Expansion of the Gram determinant $\Gamma_{h, u_1, u_2, \ldots, u_n}$ in terms of the first
column leads to the equality

$$\operatorname{Vol}(v, u_1, u_2, \ldots, u_n) = |h| \cdot \operatorname{Vol}(u_1, u_2, \ldots, u_n). \tag{10.18}$$

In other words, the Euclidean volume of a parallelepiped equals the Euclidean
volume of its codimension-1 hyperface multiplied by the distance between this face
and the opposite one.


10.4 Metric Geometry 


10.4.1 Euclidean Metric 


In this section we deal with the affine space $\mathbb{A}^n = \mathbb{A}(V)$ associated
with an $n$-dimensional Euclidean vector space $V$. For two points
$p, q \in \mathbb{A}(V)$, the length of the vector $\overrightarrow{pq} = q - p$ is
called the Euclidean distance between those points and is denoted by

$$|p, q| = |\overrightarrow{pq}| = \sqrt{(q - p,\ q - p)}\,. \tag{10.19}$$

Recall that a set $X$ is called a metric space if it is equipped with a distance function
$X \times X \to \mathbb{R}$, $p, q \mapsto |p, q|$, such that for every triple of points
$p, q, r \in X$, the following properties hold:

$$\begin{aligned}
|p, q| &= |q, p| && \text{(symmetry)}, \\
|p, q| &\geq 0 && \text{(nonnegativity)}, \\
|p, q| &= 0 \iff p = q && \text{(nonsingularity)}, \\
|p, q| &\leq |p, r| + |r, q| && \text{(triangle inequality)}.
\end{aligned} \tag{10.20}$$

The Euclidean distance (10.19) certainly satisfies the first three properties. The
triangle inequality rewritten in terms of vectors asserts that

$$\forall\, u, w \in V \quad |u| + |w| \geq |u + w|. \tag{10.21}$$

The Cauchy–Bunyakovsky–Schwarz inequality (10.11) forces

$$|u + w|^2 = |u|^2 + |w|^2 + 2(u, w) \leq |u|^2 + |w|^2 + 2|u| \cdot |w| = (|u| + |w|)^2.$$



Hence the triangle inequality (10.21) holds for the Euclidean distance as well. Thus, 
every affine Euclidean space A(V) is a metric space, and it inherits all the desirable 
properties of those spaces known from calculus. 


Exercise 10.5 Show that equality is achieved in (10.21) if and only if $w = \lambda u$
for some $\lambda \geq 0$.


Example 10.4 (Distance Between a Point and a Subspace) For a point
$a \in \mathbb{A}^n$ and affine subspace $\Pi \subset \mathbb{A}^n$ such that
$a \notin \Pi$, Theorem 10.1 says that the minimal distance $|a, p|$ taken over all
$p \in \Pi$ is achieved at a unique point $a_\Pi \in \Pi$, which is determined by the
orthogonality condition $(\overrightarrow{qp}, \overrightarrow{a a_\Pi}) = 0$ for all
$q, p \in \Pi$. Such a point $a_\Pi \in \Pi$ is called the orthogonal projection⁸ of
$a$. The distance

$$|a, \Pi| = |a, a_\Pi| = \min_{p \in \Pi} |a, p|$$

is called the distance between $a$ and $\Pi$. If $a \in \Pi$, we put $|a, \Pi| = 0$. To
compute $|a, \Pi|$, let us fix some point $q \in \Pi$ and write $U \subset V$ for the
vectorization of $\Pi$ centered at $q$. Then the vector
$\overrightarrow{q a_\Pi} = \pi_U(\overrightarrow{qa})$ is the orthogonal projection of
the vector $v = \overrightarrow{qa}$ on $U$. The distance $|a, \Pi| = |v - \pi_U v|$
equals the height of the parallelepiped spanned by $v$ and any basis
$u_1, u_2, \ldots, u_m$ in $U$. The formula (10.18) on p. 238 expresses this height as the
ratio of Gram determinants:

$$|a, \Pi|^2 = \Gamma_{v, u_1, u_2, \ldots, u_m} / \Gamma_{u_1, u_2, \ldots, u_m}, \tag{10.22}$$

where $v = \overrightarrow{qa}$, $q \in \Pi$ is an arbitrary point, and the vectors
$u_i = \overrightarrow{q p_i}$ form a basis of $U$, the direction vector space of
$\Pi \subset \mathbb{A}^n$.


⁸ Or the foot of the perpendicular drawn from $a$ onto $\Pi$.


Example 10.5 (Minimal Distance Method) In practice, the distances between
points and affine spaces appear often in the context of minimization. For example,
let us find the minimum of the integral

$$\int_0^1 f^2(t)\,dt$$

over all monic cubic polynomials $f(t) = t^3 + a_1 t^2 + a_2 t + a_3 \in \mathbb{R}[t]$.
It equals the squared distance between the zero polynomial $a = 0$ and the 3-dimensional
affine space $\Pi$ of monic cubic polynomials in the affinization of the Euclidean space
from Example 10.2 on p. 230 with inner product

$$(f, g) = \int_0^1 f(t) \cdot g(t)\,dt.$$

Take $q = t^3$ as the origin in $\Pi$ and use $u_1 = 1$, $u_2 = t$, $u_3 = t^2$ as a basis
in the vectorization of $\Pi$ centered at $q$. Then $v = \overrightarrow{qa} = -t^3$.
Since the inner products of basis vectors are

$$(u_i, u_j) = \int_0^1 t^{i+j-2}\,dt = (i + j - 1)^{-1},$$

the Gram determinant of the basis is

$$\Gamma_{u_1, u_2, u_3} = \det \begin{pmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{pmatrix} = \frac{1}{2160}.$$

The Gram determinant of the collection $(v, u_1, u_2, u_3) = (-t^3, 1, t, t^2)$ is

$$\Gamma_{v, u_1, u_2, u_3} = \det \begin{pmatrix} 1/7 & -1/4 & -1/5 & -1/6 \\ -1/4 & 1 & 1/2 & 1/3 \\ -1/5 & 1/2 & 1/3 & 1/4 \\ -1/6 & 1/3 & 1/4 & 1/5 \end{pmatrix} = \frac{1}{6\,048\,000}.$$

By formula (10.22), the minimum we are looking for, being the square of the distance
$|a, \Pi|$, is equal to

$$\frac{2160}{6\,048\,000} = \frac{1}{2800}.$$


Exercise 10.6 Find the polynomial $f(t)$ on which the above minimum is achieved
and verify the previous answer by explicit evaluation of the integral
$\int_0^1 f^2(t)\,dt$.
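
The two Gram determinants of Example 10.5 can be rechecked by machine. The following sketch is an illustration under our own conventions: polynomials are stored as coefficient lists with the constant term first, and the helper names `det`, `inner`, `gram_det` are hypothetical. It evaluates the inner product $\int_0^1 f g\,dt$ exactly and forms the ratio from formula (10.22).

```python
from fractions import Fraction

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in m]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result          # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result

def inner(f, g):
    """Integral over [0, 1] of f*g; the monomial t^(i+j) integrates to 1/(i+j+1)."""
    return sum(Fraction(a * b, i + j + 1)
               for i, a in enumerate(f) for j, b in enumerate(g))

def gram_det(vectors):
    return det([[inner(f, g) for g in vectors] for f in vectors])

u = [[1], [0, 1], [0, 0, 1]]   # the basis 1, t, t^2
v = [0, 0, 0, -1]              # the vector -t^3

base = gram_det(u)
full = gram_det([v] + u)
print(base, full, full / base)
```

The printed ratio is the squared distance from formula (10.22), that is, the minimal value of the integral.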


Example 10.6 (Equation of a Hyperplane) For every nonzero vector $a \in V$ and
$d \in \mathbb{R}$, the inhomogeneous linear equation

$$(a, x) = d \tag{10.23}$$

in the unknown vector $x \in V$ describes an affine hyperplane in $\mathbb{A}(V)$. This
hyperplane is perpendicular to the vector $a$ and is shifted by the distance $|d|/|a|$
from the origin in the direction of the vector $a$ for $d > 0$ and in the opposite
direction for $d < 0$. Indeed, the direction vector space of this hyperplane, which is
determined by the homogeneous equation $(a, x) = 0$, is the orthogonal to the
1-dimensional vector space $\mathbb{R} \cdot a$. A particular solution of the
inhomogeneous equation (10.23) is provided by the vector $x = d \cdot a/|a|^2$. It is
proportional to $a$ and has length $|d|/|a|$. Geometrically, equation (10.23) describes
the locus of all points $x$ with prescribed orthogonal projection
$\pi_a x = (a, x) \cdot a^\vee = d \cdot a/|a|^2$ on the line $\mathbb{R} \cdot a$ (see
Fig. 10.3).


Fig. 10.3 Locus of $x$: $(a, x) = d$


Exercise 10.7 Check that the distance from a point $p$ to the hyperplane (10.23) is
$|d - (a, p)|/|a|$.


Example 10.7 (Equidistant Plane) For two distinct points $a, b \in \mathbb{A}(V)$,
their equidistant is the locus of all points $x \in \mathbb{A}(V)$ such that
$|x, a| = |x, b|$. The latter condition is equivalent to the equality
$(a - x, a - x) = (b - x, b - x)$ involving the radius vectors $a, b, x \in V$.
Expanding the inner products and canceling the quadratic terms turns it into the linear
equation $2(b - a, x) = (b, b) - (a, a)$. We know from Example 10.6 that this equation
describes the hyperplane perpendicular to the segment $[a, b]$ and passing through its
midpoint $(a + b)/2$.


Exercise 10.8 Check the last claim. 


For this reason, the equidistant of $a$, $b$ is also called the middle perpendicular to
$[a, b]$.


10.4.2 Angles


The angle $\measuredangle(u, w) \in [0, \pi]$ between two nonzero vectors $u$, $w$ in a
Euclidean space $V$ is defined by

$$\cos \measuredangle(u, w) = \frac{(u, w)}{|u| \cdot |w|}\,. \tag{10.24}$$

The Cauchy–Bunyakovsky–Schwarz inequality⁹ guarantees that the right-hand side
belongs to the segment $[-1, 1]$, which is the range of cosine values, and that


wt@wet oes. 
|w| |u| 


The angle is symmetric in $u$, $w$, and is not changed under the multiplication
of vectors by positive scalars. A change of direction of any one vector replaces
the angle by the supplementary angle:
$\measuredangle(u, -w) = \measuredangle(-u, w) = \pi - \measuredangle(u, w)$.
Orthogonality of $u$ and $w$ means that $\measuredangle(u, w) = \pi/2$.

If $u$, $w$ are not collinear, the vectors $e_1 = u/|u|$ and $e_2 = v/|v|$, where
$v = w - (w, e_1) \cdot e_1$, form an orthonormal basis $e_1$, $e_2$ in the linear span
of $u$, $w$.


Exercise 10.9 Check that the bases $e_1$, $e_2$ and $u$, $w$ of that linear span are
cooriented.


The coefficients of the orthogonal decomposition $w = x_1 e_1 + x_2 e_2$ are

$$x_1 = (w, e_1) = (w, u)/|u| = |w| \cdot \cos \measuredangle(u, w)$$

and $x_2 = (w, e_2) = (w, v)/|v| = \bigl((w, w) - (w, e_1)^2\bigr)/|v| = \bigl(1 - \cos^2 \measuredangle(u, w)\bigr) \cdot |w|^2/|v|$.
Since the latter is positive, the equality $x_1^2 + x_2^2 = |w|^2$ forces

$$x_2 = |w| \cdot \sin \measuredangle(u, w).$$

On the other hand, by Cramer's rule,¹⁰ $x_2 = \det(e_1, w) = \det(u, w)/|u|$, where
$\det(u, w)$ is the oriented area¹¹ of the parallelogram spanned by $u$, $w$. Thus, we
get another formula for the angle between $u$ and $w$:

$$\sin \measuredangle(u, w) = \frac{|\det(u, w)|}{|u| \cdot |w|} = \frac{\sqrt{\Gamma_{u, w}}}{|u| \cdot |w|}\,. \tag{10.25}$$


⁹ See formula (10.11) on p. 235.


In contrast with (10.24), formula (10.25) does not distinguish between acute and 
obtuse angles. 
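
Formulas (10.24) and (10.25) can be compared on a concrete pair of vectors. The snippet below is a small sketch in floating-point Python for the standard Euclidean structure on $\mathbb{R}^n$; the function names are ours, and the Gram determinant $\Gamma_{u,w} = (u,u)(w,w) - (u,w)^2$ is clamped at zero only to guard against rounding errors.

```python
import math

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def cos_angle(u, w):
    """Formula (10.24)."""
    return inner(u, w) / math.sqrt(inner(u, u) * inner(w, w))

def sin_angle(u, w):
    """Formula (10.25): square root of the Gram determinant over |u| * |w|."""
    gram = inner(u, u) * inner(w, w) - inner(u, w) ** 2
    return math.sqrt(max(gram, 0.0)) / math.sqrt(inner(u, u) * inner(w, w))

u = [1.0, 0.0, 0.0]
w = [-1.0, 1.0, 0.0]

# The cosine is negative (obtuse angle), while the sine cannot see the difference
# between this angle and the supplementary acute one.
print(cos_angle(u, w), sin_angle(u, w))
print(math.degrees(math.acos(cos_angle(u, w))))
```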


Example 10.8 (Angle Between a Vector and a Subspace) For every nonzero vector
$v \in V$ and vector $u$ running through some subspace $U \subset V$, the angle
$\measuredangle(v, u)$ either equals $\pi/2$ for all $u \in U$ or reaches its minimal
value over all $u \in U$ exactly on the ray of vectors $\lambda \pi_U v$, $\lambda > 0$.
The first case occurs for $v \in U^\perp$. Otherwise, the minimum of
$\measuredangle(v, u)$ corresponds to the maximum of

$$\cos \measuredangle(v, u) = \frac{(v, u)}{|v| \cdot |u|} = \frac{(\pi_U v, u)}{|v| \cdot |u|} = \frac{|\pi_U v|}{|v|} \cdot \Bigl(\frac{\pi_U v}{|\pi_U v|}, \frac{u}{|u|}\Bigr).$$

The Cauchy–Bunyakovsky–Schwarz inequality¹² forces the rightmost factor to
reach its maximum at the unique unit vector $u/|u|$ that is codirectional with
$\pi_U v$. Note that $\measuredangle(v, \pi_U v)$ is acute in this case. We write
$\measuredangle(v, U) = \measuredangle(v, \pi_U v)$ for the angle between the vector $v$
and its orthogonal projection on $U$ and call it the angle between $v$ and $U$. This
angle can be computed by the formulas¹³

$$\cos \measuredangle(v, U) = |\pi_U v|/|v|, \tag{10.26}$$

$$\sin \measuredangle(v, U) = |v - \pi_U v|/|v| = \sqrt{\Gamma_{v, u_1, u_2, \ldots, u_m}} \Big/ \Bigl(\sqrt{\Gamma_{u_1, u_2, \ldots, u_m}} \cdot |v|\Bigr),$$

where $u_1, u_2, \ldots, u_m$ is any basis in $U$.


Example 10.9 (Angle Between Hyperplanes) Let $\dim V = n \geq 2$. Then for any two
hyperplanes $\Pi_1, \Pi_2 \subset \mathbb{A}^n = \mathbb{A}(V)$ with different direction
subspaces $W_1, W_2 \subset V$, the codimension $\operatorname{codim}(W_1 \cap W_2)$ is
equal to 2. Hence $\Pi_1 \cap \Pi_2$ is a nonempty affine subspace of codimension 2 in
$\mathbb{A}^n$ perpendicular to the 2-dimensional affine plane
$\mathbb{A}\bigl((W_1 \cap W_2)^\perp\bigr)$.


Exercise 10.10 Check this. 


The latter plane intersects hyperplanes $\Pi_1$, $\Pi_2$ along two nonparallel lines. The
nonobtuse angle between these lines is called the angle between the hyperplanes
$\Pi_1$, $\Pi_2$. We denote it by $\measuredangle(\Pi_1, \Pi_2)$. For parallel hyperplanes
having $W_1 = W_2$, we put $\measuredangle(\Pi_1, \Pi_2) \stackrel{\text{def}}{=} 0$. If
hyperplanes $\Pi_1$, $\Pi_2$ are given by the linear equations $(a_1, x) = d_1$ and
$(a_2, x) = d_2$ respectively, then $W_i^\perp = \mathbb{R} \cdot a_i$ and the plane
$(W_1 \cap W_2)^\perp = W_1^\perp + W_2^\perp$ is spanned by the vectors $a_1$, $a_2$. The
intersection lines of this plane with $\Pi_1$ and $\Pi_2$ are perpendicular to these
vectors. Thus,

$$\sin \measuredangle(\Pi_1, \Pi_2) = \sin \measuredangle(a_1, a_2) = \sqrt{\Gamma_{a_1, a_2}} \big/ (|a_1| \cdot |a_2|).$$


¹⁰ See formula (6.13) on p. 127.

¹¹ Normalized by the condition $\det(e_1, e_2) = 1$.

¹² See formula (10.11) on p. 235.

¹³ Compare with formula (10.22) on p. 239.




10.5 Orthogonal Group 
10.5.1 Euclidean Isometries 


A linear endomorphism $F : V \to V$ of a Euclidean vector space $V$ is called
orthogonal¹⁴ if it preserves lengths, that is, if $|Fv| = |v|$ for all $v \in V$.
Formula (10.3) on p. 230 forces every orthogonal map $F$ to preserve all inner
products as well, that is, $(Fv, Fw) = (v, w)$ for all $v, w \in V$. Thus, every
orthogonal map sends an orthonormal basis to an orthonormal basis. Conversely, if a
linear map $F : V \to V$ takes some basis of $V$ to some vectors with the same Gramian,
then $F$ preserves inner products of all vectors and therefore is orthogonal.

Since an orthogonal map preserves the Euclidean structure, it also preserves all
geometric quantities derived from it, e.g., angles and Euclidean volume. Preservation
of Euclidean volume forces $\det F = \pm 1$. In particular, all orthogonal maps
are invertible. Orthogonal maps of determinant $+1$ are called proper. Such maps
preserve the orientation. Orthogonal maps of determinant $-1$ are called improper.
They reverse the orientation.

The set of orthogonal endomorphisms forms a subgroup of the general linear
group $\mathrm{GL}(V)$. It is called the orthogonal group of $V$ and is denoted by
$\mathrm{O}(V) \subset \mathrm{GL}(V)$. The proper orthogonal automorphisms form a
subgroup

$$\mathrm{SO}(V) \stackrel{\text{def}}{=} \mathrm{O}(V) \cap \mathrm{SL}(V)$$

in $\mathrm{O}(V)$. It is called the special orthogonal group of $V$.


10.5.2 Orthogonal Matrices


Choose some orthonormal basis $e = (e_1, e_2, \ldots, e_n)$ in $V$ and represent all
linear maps $F : V \to V$ by their matrices $F_e$ in this basis. Since
$F(e) = e \cdot F_e$, the Gramian $G_{F(e)}$ is equal to
$F(e)^t F(e) = F_e^t e^t e F_e = F_e^t G_e F_e = F_e^t F_e$. Therefore, the orthogonality
of $F$, which means $G_{F(e)} = E$, is equivalent to the relation $F_e^t F_e = E$
on the matrix of $F$ in any orthonormal basis in $V$. A matrix
$C \in \mathrm{Mat}_n(\mathbb{R})$ is called orthogonal if $C^t C = E$, or equivalently,
$C^t = C^{-1}$. Thus, a map $F$ is orthogonal if and only if its matrix in some
orthonormal basis is orthogonal. The orthogonal matrices form the multiplicative groups

$$\mathrm{O}_n(\mathbb{R}) \stackrel{\text{def}}{=} \{C \in \mathrm{Mat}_n(\mathbb{R}) \mid C^t C = E\},$$
$$\mathrm{SO}_n(\mathbb{R}) \stackrel{\text{def}}{=} \{C \in \mathrm{O}_n(\mathbb{R}) \mid \det C = 1\},$$

called the orthogonal and special orthogonal real matrix groups.


¹⁴ Or isometric.
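
The defining relation $C^t C = E$ is straightforward to test for a given matrix. Here is a minimal sketch with exact rational entries; the helper names and the sample matrices, built from the Pythagorean triple $3, 4, 5$, are our own choices for illustration.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def is_orthogonal(c):
    """Check the relation C^t C = E."""
    n = len(c)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    return matmul(transpose(c), c) == identity

def det2(c):
    """Determinant of a 2 x 2 matrix."""
    return c[0][0] * c[1][1] - c[0][1] * c[1][0]

p, q = Fraction(3, 5), Fraction(4, 5)   # p^2 + q^2 = 1
proper = [[p, -q], [q, p]]              # determinant +1
improper = [[p, q], [q, -p]]            # determinant -1
shear = [[1, 1], [0, 1]]                # determinant 1, but not orthogonal

for c in (proper, improper, shear):
    print(is_orthogonal(c), det2(c))
```

The last matrix shows that $\det C = \pm 1$ is necessary but not sufficient for orthogonality.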


Example 10.10 (Isometries of the Euclidean Line) Let $F$ be an orthogonal
automorphism of the Euclidean line spanned by the vector $v$. Then $Fv = \lambda v$ and
$|v| = |Fv| = |\lambda| \cdot |v|$ forces $\lambda = \pm 1$. Thus, the isometries of the
Euclidean line are exhausted by $\pm\mathrm{Id}$.


Example 10.11 (Isometries of the Euclidean Plane) Let an orthogonal automorphism
of the Euclidean plane $U$ have matrix

$$F = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

in some orthonormal basis $e_1$, $e_2$ of $U$. The orthogonality relation $F^t F = E$ is
equivalent to the system of equations

$$a^2 + c^2 = 1, \qquad b^2 + d^2 = 1, \qquad ab + cd = 0.$$

All solutions of the first two equations are parametrized as

$$a = \cos\varphi, \quad c = \sin\varphi, \qquad b = \sin\psi, \quad d = \cos\psi.$$

Then the third equation is equivalent to the relation $\sin(\psi + \varphi) = 0$. Hence,
up to addition of integer multiples of $2\pi$, either $\psi = -\varphi$ or
$\psi = \pi - \varphi$. In the first case,

$$F = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}$$

is the counterclockwise rotation through the angle $\varphi$ about the origin. In the
second case,

$$F = \begin{pmatrix} \cos\varphi & \sin\varphi \\ \sin\varphi & -\cos\varphi \end{pmatrix}$$

is the composition of the orthogonal reflection $e_1 \mapsto e_1$, $e_2 \mapsto -e_2$ in
the first coordinate axis, applied first, with the previous rotation. Such a composition
is the reflection in the bisector of the angle between the vectors $e_1$ and $Fe_1$ (see
Fig. 10.4).


Fig. 10.4 A reflection composed with a rotation is a reflection 


Exercise 10.11 Prove this. 


Thus, the proper linear orthogonal automorphisms of the Euclidean plane are 
exhausted by rotations about the origin, while the improper linear orthogonal 
automorphisms are exhausted by the orthogonal reflections in lines passing through 
the origin. 


Example 10.12 (Reflections in Hyperplanes) A vector $e \in V$ is called a unit vector
if $|e| = 1$, or equivalently, $(e, e) = 1$. Associated with such a vector is its
orthogonal hyperplane $e^\perp \subset V$ and an improper linear isometry
$\sigma_e : V \to V$ that sends $e$ to $-e$ and acts identically on $e^\perp$ (see
Fig. 10.5). The isometry $\sigma_e$ is called an orthogonal reflection¹⁵ in the
hyperplane $e^\perp$. In the language of formulas, $\sigma_e$ is described as

$$\sigma_e : v \mapsto v - 2(v, e) \cdot e. \tag{10.27}$$


Exercise 10.12 Verify that the map (10.27) is linear, acts identically on $e^\perp$,
sends $e$ to $-e$, and preserves the inner product.


For an arbitrary nonzero vector $a \in V$, we write $\sigma_a$ for the orthogonal
reflection in the hyperplane $a^\perp$. It coincides with the reflection $\sigma_e$ for
the unit vector $e = a/|a|$ and is described by the formula

$$\sigma_a(v) = v - 2\,\frac{(v, a)}{(a, a)} \cdot a. \tag{10.28}$$

Note that $\sigma_a = \sigma_b$ if and only if $a$ and $b$ are proportional.


¹⁵ Or just a reflection for short.
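
Formula (10.28) translates directly into code. The following sketch, for the standard Euclidean structure on $\mathbb{R}^n$ with exact rational coordinates, uses a hypothetical function name `reflect` and checks on one example the properties of a reflection listed above: it sends $a$ to $-a$, fixes vectors orthogonal to $a$, is an involution, and preserves inner products.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(p * q for p, q in zip(x, y))

def reflect(v, a):
    """Orthogonal reflection of v in the hyperplane orthogonal to a, formula (10.28)."""
    c = 2 * inner(v, a) / inner(a, a)
    return [x - c * y for x, y in zip(v, a)]

a = [Fraction(1), Fraction(2), Fraction(2)]
v = [Fraction(3), Fraction(0), Fraction(-1)]
w = [Fraction(2), Fraction(-1), Fraction(0)]   # orthogonal to a

print(reflect(a, a))               # equals -a
print(reflect(w, a))               # equals w
print(reflect(reflect(v, a), a))   # equals v
print(inner(reflect(v, a), reflect(v, a)), inner(v, v))
```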


Fig. 10.5 Reflection o, 


Exercise 10.13 Show that two nonzero vectors of equal lengths can be swapped 
by appropriate reflection and deduce from this that every linear orthogonal auto- 
morphism of V can be split into the composition of at most dim V reflections in 
hyperplanes. 


If $a$ and $b$ are not proportional, the composition of reflections $\sigma_b \sigma_a$
acts identically on the subspace $a^\perp \cap b^\perp$ of codimension 2 and induces a
proper isometry in the orthogonal complementary 2-dimensional plane spanned by $a$, $b$.
Such an isometry has to be a rotation. Looking at how it acts on the basis $a$, $b$, we
conclude that the line $a^\perp$ is rotated in the direction of the line $b^\perp$ by
the doubled acute angle between these lines.


Example 10.13 (Proper Isometries of 3-Dimensional Euclidean Space) By
Exercise 10.13, a proper isometry $F$ of 3-dimensional Euclidean space $V$ is either the
identity map $\mathrm{Id}_V$ or a composition of reflections $F = \sigma_b \sigma_a$ in
two different 2-dimensional planes $a^\perp$ and $b^\perp$. As was explained at the very
end of Example 10.12, such a composition is the rotation about the line
$a^\perp \cap b^\perp$, which is perpendicular to both vectors $a$, $b$, in the
direction from the plane $a^\perp$ to the plane $b^\perp$ by the doubled acute angle
between them. Thus, each nonidentical proper linear isometry of Euclidean 3-space is a
rotation about some line. This fact is known as Euler's theorem.


Problems for Independent Solution to Chap. 10


Problem 10.1 In the standard Euclidean structure on a real coordinate space
construct an explicit orthonormal basis for (a) the hyperplane
$x_1 + x_2 + \cdots + x_n = 0$ in $\mathbb{R}^n$, (b) the subspace in $\mathbb{R}^4$
spanned by the vectors $(1, 2, 2, -1)$, $(1, 1, -5, 3)$, $(3, 2, 8, -7)$, (c) the
orthogonal complement to the previous subspace.


Problem 10.2 Let U C R* be the solution space of the system of linear equations 


2x, +X. + 3x3 —Xx4 = 0, 
3x1 + 2x2 — 2x4 => 0, 
3x, +X. + 9x3 -2x4 = 0. 


Write down a system of linear equations whose solution space is U+. 


Problem 10.3 For given $a \in \mathbb{R}^n$ and $d_1, d_2 \in \mathbb{R}$, find the
distance between the parallel hyperplanes $(a, x) = d_1$ and $(a, x) = d_2$ in
$\mathbb{R}^n$.

Problem 10.4 Given $k + 1$ distinct points $p_0, p_1, \ldots, p_k \in \mathbb{R}^n$ that
do not lie in a common $(k - 1)$-dimensional affine subspace, describe the locus of all
points equidistant from all the $p_i$.


Problem 10.5 Find the maximal number of distinct nonzero vectors in $\mathbb{R}^n$ such
that all the angles between them are obtuse.¹⁶


Problem 10.6 (Sphere) For given point c € R” and positive number r € R, the 
geometric figure 


Se #fyeR": |ex)/ =r 


is called the (n — 1)-sphere of radius r with center c. Show that every set of n+ 1 
points in R” lie either on a hyperplane or on a unique (n — 1)-sphere. 

Problem 10.7 (Cube) The figure
$I^n \stackrel{\text{def}}{=} \{(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n \mid \forall\, i \ |x_i| \leq 1\}$
is called a standard $n$-cube.


(a) Draw a plane projection of $I^4$ in which all the vertices of $I^4$ are distinct.

(b) Describe a three-dimensional involute of the boundary¹⁷ of $I^4$ and explain how
its 2-dimensional faces should be glued together in order to get the actual
3-dimensional boundary of $I^4$.

(c) For all $0 \leq k \leq (n - 1)$, find the total number of $k$-dimensional faces of $I^n$.

(d) A segment $[v, -v]$ joining two opposite vertices of $I^n$ is called an internal
diagonal. How many internal diagonals are there in $I^n$?

(e) How many internal diagonals of $I^n$ are perpendicular to a given internal
diagonal?

(f) Calculate the length of an internal diagonal¹⁸ in $I^n$ and its limit as $n \to \infty$.

(g) Calculate the lengths of segments between the orthogonal projections of
vertices onto an internal diagonal.

(h) Calculate the angle between an internal diagonal and an edge of $I^n$. Find its
limit as $n \to \infty$.

(i) Calculate the angle between an internal diagonal and a face¹⁹ of $I^n$. Find its
limit as $n \to \infty$.

(j) How many mirror hyperplanes²⁰ does $I^n$ have?


¹⁶ In $\mathbb{R}^2$, for example, there are at most three such vectors; in
$\mathbb{R}^3$, there are at most four.

¹⁷ Such an involute consists of 3-dimensional cubes in $\mathbb{R}^3$ glued somehow along
their 2-dimensional faces like the flat involute of the 2-dimensional surface of the
brick $I^3$.


Problem 10.8 Give an explicit description of each 3-dimensional polyhedron cut
out of $I^4$ by the 3-dimensional hyperplane $x_1 + x_2 + x_3 + x_4 = c$ as $c$ runs
through $[-4, 4]$.

Problem 10.9 (Simplex) The convex hull²¹ of the heads of the standard basis
vectors in $\mathbb{R}^{n+1}$,

$$\Delta^n \stackrel{\text{def}}{=} \bigl\{(x_0, x_1, \ldots, x_n) \in \mathbb{R}^{n+1} \bigm| \text{all } x_i \geq 0 \text{ and } \textstyle\sum x_i = 1\bigr\},$$

is called the standard $n$-simplex. Draw $\Delta^1$ and $\Delta^2$ in the plane. Then:


(a) Draw some plane projections of $\Delta^3$ and $\Delta^4$ on which all the vertices
are distinct.

(b) Describe some three-dimensional involute of the boundary of $\Delta^4$ and explain
how its 2-dimensional faces should be glued to get the actual 3-dimensional
boundary of $\Delta^4$.

(c) For all $0 \leq k \leq (n - 1)$, find the total number of $k$-dimensional faces of
$\Delta^n$.

(d) Show that there is a unique $(n - 1)$-sphere touching all faces of $\Delta^n$. Find
the radius of this sphere and its limit as $n \to \infty$.

(e) Show that there is a unique $(n - 1)$-sphere containing all vertices of $\Delta^n$.
Find the radius of this sphere and its limit as $n \to \infty$.

(f) Find the length of the perpendicular drawn from a vertex of $\Delta^n$ onto the
opposite face and compute its limit as $n \to \infty$.

(g) Find the angle between an edge of $\Delta^n$ and one of the two faces that does not
contain this edge; calculate the limit of this angle as $n \to \infty$.

(h) For each $1 \leq m \leq (n - 1)$, find the distance between nonintersecting $m$- and
$(n - m - 1)$-dimensional faces of $\Delta^n$.


¹⁸ That is, the diameter of the sphere circumscribed about $I^n$.

¹⁹ That is, an $(n - 1)$-dimensional face.

²⁰ That is, hyperplanes $e^\perp \subset \mathbb{R}^n$ such that the reflection
$\sigma_e : \mathbb{R}^n \to \mathbb{R}^n$ sends $I^n$ to itself (see
Example 10.12 on p. 246 for details on reflections).

²¹ See Example 6.16 on p. 145.


Problem 10.10 Consider the standard 4-simplex $ABCDE$ and write $X$ for the
midpoint of the segment joining the centers of the 2-dimensional faces $ABC$ and
$CDE$. Show that there is a unique line $YZ$ passing through $X$ and intersecting both
the line $AE$ and the plane $BCD$ at some points $Y$ and $Z$ respectively. Find
$\overrightarrow{XY} : \overrightarrow{YZ}$, that is, $\lambda \in \mathbb{R}$ such that
$\overrightarrow{XY} = \lambda \cdot \overrightarrow{YZ}$.


Problem 10.11 Give an explicit description of each 3-dimensional polyhedron cut
out of $\Delta^4 \subset \mathbb{R}^5$ by the hyperplane (a) $x_1 = \mathrm{const}$,
(b) $x_1 + x_2 = \mathrm{const}$.


Problem 10.12 (Volume of a Simplex) A 1-dimensional echelon pyramid of height 
k consists of I} © k unit segments aligned along the horizontal coordinate axis: 


— 
A 2-dimensional echelon pyramid of height k consists of 
1 #i+ +--+ 
unit squares aligned along the vertical axis: 


k 


STS 


Similarly, an n-dimensional echelon pyramid of height & consists of 
Ty = Og + Wp te + 


unit (n — 1)-cubic bricks stacked in piles along the nth coordinate axis. Calculate 
the total number of these bricks. Send k to infinity and find the ratio between 
the Euclidean volume of an n-dimensional cube and the Euclidean volume of the 
n-dimensional simplex obtained as the convex hull of some vertex v of the cube 
and all vertices joined with v by an edge of the cube. 


Problem 10.13 (Cocube) The convex hull of the centers of the faces of the standard
cube $I^n \subset \mathbb{R}^n$ is called the standard $n$-dimensional cocube and is
denoted by $C^n$.

(a) Describe $C^n$ by an explicit system of linear inequalities.

(b) For all $0 \leq k \leq (n - 1)$, find the total number of $k$-dimensional faces of $C^n$.

(c) Find the length of an edge of $C^n$ and its limit as $n \to \infty$.

(d) Find the radius of the $(n - 1)$-sphere inscribed in $C^n$ and its limit as
$n \to \infty$.


Problem 10.14 For $k$ points $p_1, p_2, \ldots, p_k$ in affine Euclidean space write
$D_{p_1, p_2, \ldots, p_k}$ for the $k \times k$ matrix with elements
$d_{ij} = |p_i, p_j|^2$, and $C_{p_1, p_2, \ldots, p_k}$ for the $(k + 1) \times (k + 1)$
matrix whose elements $c_{ij}$ are numbered by $0 \leq i, j \leq k$ and are equal to
$d_{ij}$ for $1 \leq i, j \leq k$, whereas $c_{00} = 0$ and $c_{0j} = c_{i0} = 1$ for all
$1 \leq i, j \leq k$. As usual, $G_{w_1, w_2, \ldots, w_m}$ denotes the Gram matrix of
vectors $w_1, w_2, \ldots, w_m$. Prove that:

(a) $2^n \det G_{\overrightarrow{p_0 p_1}, \overrightarrow{p_0 p_2}, \ldots, \overrightarrow{p_0 p_n}} = (-1)^{n+1} \det C_{p_0, p_1, \ldots, p_n}$
(note that the matrices have different sizes).

(b) $n + 1$ points $p_0, p_1, \ldots, p_n \in \mathbb{R}^n$ lie in a hyperplane if and only
if $\det C_{p_0, p_1, \ldots, p_n} = 0$.

(c) $n + 2$ points $p_0, p_1, \ldots, p_{n+1} \in \mathbb{R}^n$ lie either in a hyperplane
or on a sphere if and only if $\det D_{p_0, p_1, \ldots, p_{n+1}} = 0$.

(d) The radius $r$ of the sphere circumscribed about the $n$-simplex
$p_0 p_1 \ldots p_n$ satisfies the equality
$2r^2 = -\det D_{p_0, p_1, \ldots, p_n} / \det C_{p_0, p_1, \ldots, p_n}$.


Problem 10.15 (Orthogonal Polynomials) Check that the following inner prod- 
ucts provide R[x] with a Euclidean structure: 
1 
-1/2 
@(f.e)= | fare (1-2) 
cae 


(©) (f.8) =f. ie? WERE [ Flx)e(x) dx. 


Consider a sequence of 


+00 


dx, (b)(f,g) = f f(g (xe “dx, 


(1) Chebyshev polynomials T,,(x) = goal arccos x), 
(2) Laguerre polynomials L,(x) = e* T (e-*x"), 


(3) Hermite polynomials E,,(x) = e a e*, 
(4) Legendre polynomials P,(x) = = : (1 — x7)", 


For each sequence, indicate the Euclidean structure in which this sequence 
is orthogonal and compare the sequence with the output of Gram—Schmidt 
orthogonalization applied to the standard monomial basis x* in R[x]. 


Problem 10.16*. Find the minimum of ie f° (x) dx over all degree-k monic poly- 
nomials f € R[x]. To begin with, consider k = 2, 3, 4. 


Problem 10.17 In the space of continuous functions $[-\pi, \pi] \to \mathbb{R}$ with
inner product

$$(f, g) = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx,$$

find the polynomial of degree at most 3 closest to $\sin x$.

Problem 10.18 Check that the inner product $(A, B) = \operatorname{tr}(AB^t)$ provides
$\mathrm{Mat}_n(\mathbb{R})$ with a Euclidean structure. Describe the orthogonal
complements to (a) upper triangular matrices,²² (b) symmetric matrices,²³
(c) skew-symmetric matrices,²⁴ (d) traceless matrices.²⁵


Problem 10.19 (Adjoint Linear Maps) Show that for every linear map of
Euclidean vector spaces $F : U \to W$ there exists a unique linear map²⁶
$F^\vee : W \to U$ such that $(F^\vee w, u) = (w, Fu)$ for all $w \in W$ and $u \in U$.
Check that $(F_1 \circ F_2)^\vee = F_2^\vee \circ F_1^\vee$,
$\ker F^\vee = (\operatorname{im} F)^\perp$, $\operatorname{im} F^\vee = (\ker F)^\perp$.
For bases $u = (u_1, u_2, \ldots, u_n)$, $w = (w_1, w_2, \ldots, w_m)$ of $U$, $W$, prove
that the matrix $F^\vee_{uw}$ of $F^\vee$ is expressed in terms of the matrix $F_{wu}$ of
$F$ and Gramians $G_u$, $G_w$ as $F^\vee_{uw} = G_u^{-1} F_{wu}^t G_w$.

Problem 10.20 (Normal Linear Endomorphisms) Show that for every Euclidean
vector space $V$, the following properties of a linear endomorphism $F : V \to V$
are equivalent²⁷:


(a) $F^\vee \circ F = F \circ F^\vee$,
(b) $\forall\, v \in V \quad |F^\vee v| = |Fv|$,
(c) $\forall\, u, w \in V \quad (F^\vee u, F^\vee w) = (Fu, Fw)$.


²² A matrix $C = (c_{ij})$ is called upper triangular if $c_{ij} = 0$ for all $i > j$.

²³ A matrix $C = (c_{ij})$ is called symmetric if $c_{ij} = c_{ji}$ for all $i, j$.

²⁴ A matrix $C = (c_{ij})$ is called skew-symmetric if $c_{ij} = -c_{ji}$ for all $i, j$.

²⁵ A matrix $C = (c_{ij})$ is called traceless if
$\operatorname{tr} C \stackrel{\text{def}}{=} \sum c_{ii} = 0$.

²⁶ It is called the Euclidean adjoint to $F$.

²⁷ A linear endomorphism $F$ possessing these properties is called normal.


Chapter 11 
Projective Spaces 


11.1 Projectivization 


11.1.1) Points and Charts 


Let V be a vector space of dimension (n + 1) over a field k. Besides the (n + 1)- 
dimensional affine space! A’t! = A(V), associated with V is the n-dimensional 
projective space P,, = P(V), called the projectivization of V. By definition, points 
of P(V) are 1-dimensional vector subspaces in V, or equivalently, lines in A(V) 
passing through the origin. To observe such points as usual “dots,” we have to use 
a screen, that is, an n-dimensional affine hyperplane in A(V) that does not pass 
through the origin (see Fig. 11.1). 

Every affine hyperplane of this sort is described by an inhomogeneous linear
equation $\xi(x) = 1$, where $\xi \in V^*$ is a nonzero linear form on $V$. We write $U_\xi = \{v \in V \mid \langle v, \xi \rangle = 1\}$ for such a screen and call it the affine chart provided by the
covector $\xi$.


Exercise 11.1 Make certain for yourself that the rule $\xi \mapsto U_\xi$ establishes a bijection
between nonzero covectors $\xi \in V^*$ and affine hyperplanes in $\mathbb{A}(V)$ that do not pass
through the origin.


Not all points of $\mathbb{P}(V)$ are visible in a chart $U_\xi$. The complement $\mathbb{P}(V) \smallsetminus U_\xi$ consists
of 1-dimensional subspaces spanned by nonzero vectors $v \in V$ parallel to $U_\xi$, i.e.,
lying inside the $n$-dimensional vector subspace $\operatorname{Ann}(\xi) = \{v \in V \mid \langle v, \xi \rangle = 0\}$ in
$V$. They form an $(n - 1)$-dimensional projective space $\mathbb{P}_{n-1} = \mathbb{P}(\operatorname{Ann} \xi)$, called the hyperplane
at infinity² of the chart $U_\xi$. The points of $\mathbb{P}(\operatorname{Ann} \xi)$ can be thought of as directions
in the affine space $U_\xi$. Thus, as a set, projective space splits into the disjoint union


¹ See Sect. 6.5 on p. 142.
² Or just the infinity.

Fig. 11.1 Projective space (figure labels: hyperplane at infinity $\mathbb{P}(\operatorname{Ann} \xi)$, affine chart $U_\xi$)


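$\mathbb{P}_n = \mathbb{A}^n \sqcup \mathbb{P}_{n-1}$. Repeating this procedure, we obtain the decomposition $\mathbb{P}_n = \mathbb{A}^n \sqcup \mathbb{A}^{n-1} \sqcup \cdots \sqcup \mathbb{A}^0$, where the last element $\mathbb{A}^0 = \mathbb{P}_0$ is one point.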


Exercise 11.2 Over the finite field of $q$ elements, compute independently the
cardinalities of $\mathbb{P}_n$ and $\mathbb{A}^n \sqcup \mathbb{A}^{n-1} \sqcup \cdots \sqcup \mathbb{A}^0$. What kind of identity involving $q$
do you get?
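The two counts can be compared numerically over a small prime field. The following Python sketch is only an illustration: the names `projective_points` and `cell_count` are introduced here and do not come from the text, and the field is assumed to be $\mathbb{F}_p$ with $p$ prime, so that arithmetic modulo $p$ suffices. Each point of $\mathbb{P}_n$ is listed once through the representative whose first nonzero coordinate equals 1.

```python
from itertools import product


def projective_points(n, p):
    """List the points of P_n over F_p (p is assumed prime).

    Every line through the origin in F_p^(n+1) contains exactly one
    vector whose first nonzero coordinate equals 1, so such vectors
    enumerate the points without repetition.
    """
    points = []
    for v in product(range(p), repeat=n + 1):
        first_nonzero = next((x for x in v if x != 0), None)
        if first_nonzero == 1:
            points.append(v)
    return points


def cell_count(n, p):
    """Total size of the affine cells A^n, A^(n-1), ..., A^0 over F_p."""
    return sum(p**i for i in range(n + 1))
```

Comparing `len(projective_points(n, p))` with `cell_count(n, p)` for several small `n` and `p` gives an experimental check of the decomposition before the identity is derived by hand.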


11.1.2 Global Homogeneous Coordinates 


Let us fix a basis $x_0, x_1, \ldots, x_n$ in $V^*$ and identify $V$ with the coordinate space
$k^{n+1}$ by the rule $v \mapsto (x_0(v), x_1(v), \ldots, x_n(v))$. Two nonzero vectors $v = (x_0, x_1, \ldots, x_n)$ and $w = (y_0, y_1, \ldots, y_n)$ produce the same point $p \in \mathbb{P}_n$ if and only
if their coordinates are proportional, i.e., $x_\mu : x_\nu = y_\mu : y_\nu$ for all $0 \le \mu \ne \nu \le n$,
where the equalities $0 : x = 0 : y$ and $x : 0 = y : 0$ are allowed as well. Thus, the
collection of ratios $(x_0 : x_1 : \cdots : x_n)$ matches some point $p \in \mathbb{P}_n$. This collection is
called the homogeneous coordinates of $p$ in the basis $x_0, x_1, \ldots, x_n$ of $V^*$.
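The proportionality criterion can be tested without any division by cross-multiplying the ratios, which also handles the zero cases $0 : x = 0 : y$ automatically. The sketch below is an illustration under the assumption that coordinates are exact numbers (integers or `Fraction` values); the function name `same_point` is hypothetical.

```python
from itertools import combinations


def same_point(x, y):
    """Decide whether two nonzero coordinate vectors give the same point of P_n.

    The ratios x_mu : x_nu and y_mu : y_nu agree for all mu != nu exactly
    when every 2x2 minor x_mu * y_nu - x_nu * y_mu vanishes.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if not any(x) or not any(y):
        raise ValueError("the zero vector does not represent a point")
    return all(
        x[mu] * y[nu] == x[nu] * y[mu]
        for mu, nu in combinations(range(len(x)), 2)
    )
```

For instance, `same_point((1, 2, 3), (2, 4, 6))` holds, whereas `same_point((1, 0), (0, 1))` does not.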
11.1.3 Local Affine Coordinates


Let us fix an affine chart $U_\xi = \{x \in \mathbb{A}(V) \mid \xi(x) = 1\}$ provided by the covector
$\xi \in V^*$. Choose $n$ linear forms $\xi_1, \xi_2, \ldots, \xi_n \in V^*$ such that $\xi, \xi_1, \xi_2, \ldots, \xi_n$ form a
basis in $V^*$, and write $e_0, e_1, \ldots, e_n \in V$ for the dual basis in $V$. Then $e_0, e_1, \ldots, e_n$
form an affine coordinate system³ in $U_\xi \subset \mathbb{A}(V)$ with the origin at $e_0 \in U_\xi$ and
the basis $e_1, e_2, \ldots, e_n$ in the vector space $\operatorname{Ann}(\xi) \subset V$ with which the affine space
$U_\xi$ is associated. The coordinates of the points in this system are called local affine
coordinates in $U_\xi$. Note that they are well defined only within $U_\xi$. The local affine
coordinates of a point $p \in \mathbb{P}_n = \mathbb{P}(V)$ with homogeneous coordinates $(x_0 : x_1 : \cdots : x_n)$ are evaluated as follows. Among all proportional vectors representing $p$
we pick the vector $v = p/\xi(p)$ that has $\xi(v) = 1$ and therefore lies in $U_\xi$. Note
that the equality $\xi(p) = 0$ means that $p \notin U_\xi$. Then we evaluate the linear forms
$\xi_1, \xi_2, \ldots, \xi_n$ on this vector. Note that the resulting affine coordinates

$$\xi_i(v) = \xi_i(p)/\xi(p), \quad 1 \le i \le n,$$

are not linear but linear fractional functions of the homogeneous coordinates. In
particular, as $p$ runs to infinity, that is, $\xi(p) \to 0$, the local affine coordinates of $p$
actually tend to infinite values.
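The recipe above translates directly into a short computation. In the following sketch, covectors are assumed to be given by their coefficient tuples in the basis $x_0, x_1, \ldots, x_n$, so that the contraction is an ordinary dot product; the function name `affine_coordinates` and this encoding are assumptions made only for illustration.

```python
from fractions import Fraction


def affine_coordinates(p, xi, xis):
    """Local affine coordinates of the point represented by p in the chart U_xi.

    p    -- nonzero vector of homogeneous coordinates
    xi   -- coefficients of the covector defining the chart
    xis  -- coefficients of xi_1, ..., xi_n completing xi to a basis of V*

    Returns the list of xi_i(p) / xi(p), or None when xi(p) = 0,
    i.e. when the point lies at infinity for this chart.
    """
    def contract(form, vec):
        return sum(Fraction(a) * Fraction(b) for a, b in zip(form, vec))

    denominator = contract(xi, p)
    if denominator == 0:
        return None
    return [contract(form, p) / denominator for form in xis]
```

Proportional vectors such as `(1, 2, 3)` and `(2, 4, 6)` produce the same output, which reflects the fact that the coordinates depend only on the point of $\mathbb{P}_n$ and not on the chosen representative.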


Fig. 11.2 Standard affine charts on $\mathbb{P}_1$ (figure labels: $(p_0 : p_1) = (1 : t) = (s : 1)$, $t = p_1/p_0$, chart $U_0$)


³ See Sect. 6.5.2 on p. 143.
Example 11.1 (The Projective Line) The projective line $\mathbb{P}_1 = \mathbb{P}(k^2)$ is covered by
two affine charts $U_0 = U_{x_0}$, $U_1 = U_{x_1}$, that is, by two affine lines defined by the
equations $x_0 = 1$ and $x_1 = 1$ (see Fig. 11.2). The chart $U_0$ contains all points except
for one point $(0 : 1)$ represented by the vertical coordinate axis. This is the only
infinite point for the chart $U_0$. Every other point $(x_0 : x_1)$ with $x_0 \ne 0$ is visible
in $U_0$ as $(1 : x_1/x_0)$. The function $t = x_1|_{U_0} = x_1/x_0$ can be used as a local affine
coordinate within $U_0$. The chart $U_1$ consists of points $(x_0 : x_1) = (x_0/x_1 : 1)$ with
$x_1 \ne 0$, and the function $s = x_0|_{U_1} = x_0/x_1$ can be used as a local affine coordinate
within $U_1$. The only infinite point for $U_1$ is $(1 : 0)$, represented by the horizontal
coordinate axis. Local affine coordinates $s$, $t$ of the same point $(x_0 : x_1) \in \mathbb{P}_1$ lying
within $U_0 \cap U_1$ are related by $s = 1/t$. Thus, $\mathbb{P}_1$ consists of two copies of the affine
line $\mathbb{A}^1$ (one line with coordinate $s$, another with coordinate $t$) glued together along
the complements to their origins by the following rule: point $s$ on the first line is
glued with point $t = 1/s$ on the second.

For $k = \mathbb{R}$, the result of such gluing can be identified with a circle of diameter
1 glued from two diametrically opposite tangent lines mapped onto the circle via
central projections from the opposite points of tangency (see Fig. 11.3). Similarly,
for $k = \mathbb{C}$, two copies of $\mathbb{A}^1 = \mathbb{C}$ glued together by the rule $s = 1/t$
produce a sphere of diameter 1 consisting of two tangent planes drawn through the
south and north poles and mapped onto the sphere by the central projections from
opposite poles (see Fig. 11.4). Indeed, if the orientations of the planes are chosen⁴
as in Fig. 11.4, then the complex numbers $s$, $t$ corresponding to the same point of
the sphere have opposite arguments, and their moduli are inverses of each other by
Fig. 11.3.


Fig. 11.3 $\mathbb{P}_1(\mathbb{R}) \simeq S^1$


⁴ These orientations will coincide when we overlap the planes by a continuous movement along the
surface of the sphere.

Fig. 11.4 $\mathbb{P}_1(\mathbb{C}) \simeq S^2$


Example 11.2 (Standard Affine Atlas on $\mathbb{P}_n$) For $\mathbb{P}_n = \mathbb{P}(k^{n+1})$, the collection of
$(n + 1)$ affine charts $U_\nu = U_{x_\nu}$, $0 \le \nu \le n$, i.e., hyperplanes $x_\nu = 1$ in $\mathbb{A}^{n+1}$, is
called the standard affine atlas on $\mathbb{P}_n$. For each $\nu$, the point $e_\nu \in U_\nu$ is taken as the
origin of the standard affine coordinate system in $U_\nu$, and the functions

$$t_i^{(\nu)} \stackrel{\text{def}}{=} x_i|_{U_\nu} = \frac{x_i}{x_\nu} \quad \text{for } 0 \le i \le n,\ i \ne \nu,$$

are used as the standard affine coordinates. The intersection $U_\mu \cap U_\nu$ consists of
all $x$ with both homogeneous coordinates $x_\mu$, $x_\nu$ nonzero. In the standard affine
coordinates in $U_\mu$ (respectively in $U_\nu$), the locus of such points is described by the
inequality $t_\nu^{(\mu)} \ne 0$ (respectively by the inequality $t_\mu^{(\nu)} \ne 0$). The affine coordinates
of the points $t^{(\mu)} \in U_\mu$ and $t^{(\nu)} \in U_\nu$ representing the same point in $\mathbb{P}_n$ are related
by

$$t_\nu^{(\mu)} = 1/t_\mu^{(\nu)} \quad \text{and} \quad t_i^{(\mu)} = t_i^{(\nu)}/t_\mu^{(\nu)} \quad \text{for } i \ne \mu, \nu. \tag{11.1}$$

The right-hand sides of these equalities are called transition functions from the local
coordinates $t^{(\nu)}$ in $U_\nu$ to the local coordinates $t^{(\mu)}$ in $U_\mu$. Therefore, $\mathbb{P}_n$ is glued
from $(n + 1)$ disjoint copies $U_0, U_1, \ldots, U_n$ of the affine space $\mathbb{A}^n$ by the gluing
rules (11.1).
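The gluing rules (11.1) can be checked on concrete points. The sketch below is an illustration, assuming integer homogeneous coordinates and storing the standard affine coordinates $t^{(\nu)}$ as a dictionary indexed by $i \ne \nu$; the names `chart_coords` and `transition` are hypothetical.

```python
from fractions import Fraction


def chart_coords(x, nu):
    """Standard affine coordinates t^(nu) of the point x in the chart U_nu.

    x is a tuple of integer homogeneous coordinates with x[nu] != 0.
    The result maps each index i != nu to x_i / x_nu.
    """
    if x[nu] == 0:
        raise ValueError("the point is not visible in this chart")
    return {i: Fraction(x[i], x[nu]) for i in range(len(x)) if i != nu}


def transition(t_nu, nu, mu):
    """Apply (11.1): compute t^(mu) from t^(nu) on the overlap of U_mu and U_nu."""
    if t_nu[mu] == 0:
        raise ValueError("the point lies outside U_mu")
    t_mu = {nu: 1 / t_nu[mu]}
    for i, value in t_nu.items():
        if i != mu:
            t_mu[i] = value / t_nu[mu]
    return t_mu
```

For a point such as `x = (2, 3, 5, 7)`, the value `transition(chart_coords(x, 1), 1, 3)` coincides with `chart_coords(x, 3)`, as the gluing rules predict.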
11.2 Polynomials Revisited


11.2.1 Polynomial Functions on a Vector Space 


Every choice of basis $x_0, x_1, \ldots, x_n$ in the vector space $V^*$ dual to an $(n + 1)$-dimensional vector space $V$ allows us to treat each polynomial $f \in k[x_0, x_1, \ldots, x_n]$
as a polynomial function $\tilde f : V \to k$ that takes a vector $v \in V$ with coordinates $v_i = x_i(v) \in k$ to the result of the evaluation $\operatorname{ev}_{(v_0, v_1, \ldots, v_n)}(f) = f(v_0, v_1, \ldots, v_n) \in k$.
Thus, we get a homomorphism of $k$-algebras

$$\alpha : k[x_0, x_1, \ldots, x_n] \to k^V, \quad f \mapsto \tilde f, \tag{11.2}$$

which maps the polynomial algebra to the algebra of functions $V \to k$. Its image is
called the algebra of polynomial functions on $V$.


Exercise 11.3 Check that the $k$-subalgebra $\operatorname{im} \alpha \subset k^V$ does not depend on the
choice of basis in $V^*$.


Proposition 11.1 For an infinite ground field $k$, the homomorphism (11.2) is
injective. If $k$ is finite and $\dim V < \infty$, then the homomorphism (11.2) is surjective
and has nonzero kernel.


Proof The first statement was already established in Exercise 9.6 on p. 213. For
$k = \mathbb{F}_q$ and $V = \mathbb{F}_q^n$, the functions $V \to k$ form a vector space of dimension $q^n$ with
a basis formed by $\delta$-functions $\delta_p$, $p \in k^n$, such that $\delta_p(p) = 1$ and $\delta_p(q) = 0$ for all
$q \ne p$.

Exercise 11.4 Verify that $\delta_p = \tilde f$ for

$$f(x_1, \ldots, x_n) = \frac{\prod_{i=1}^{n} \prod_{\xi \ne p_i} (x_i - \xi)}{\prod_{i=1}^{n} \prod_{\xi \ne p_i} (p_i - \xi)},$$

where $\xi$ runs through $\mathbb{F}_q \smallsetminus p_i$ in both products.

Therefore, the homomorphism (11.2) is surjective. Since $k[x_1, x_2, \ldots, x_n]$ is infinite-dimensional as a vector space over $k$, whereas the space of functions $V \to k$ is finite-dimensional, the homomorphism (11.2) has nonzero kernel.
□
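Both halves of the finite-field statement can be observed experimentally. The following sketch assumes that $q$ is prime, so that $\mathbb{F}_q$ is modelled by integers modulo $q$; it evaluates the quotient of products from Exercise 11.4 and also checks that the nonzero polynomial $x^q - x$ vanishes at every element of $\mathbb{F}_q$, which is one standard example of a nonzero element of the kernel. The function names are hypothetical, and `pow(den, -1, q)` requires Python 3.8 or later.

```python
from itertools import product


def delta_value(point, p_vec, q):
    """Evaluate the polynomial f of Exercise 11.4 at `point` in F_q^n (q prime)."""
    num, den = 1, 1
    for x_i, p_i in zip(point, p_vec):
        for xi in range(q):
            if xi != p_i:
                num = num * (x_i - xi) % q
                den = den * (p_i - xi) % q
    # den is a product of nonzero residues, hence invertible modulo the prime q
    return num * pow(den, -1, q) % q


def is_delta_function(p_vec, q):
    """Check that f equals 1 at p_vec and 0 at every other point of F_q^n."""
    return all(
        delta_value(point, p_vec, q) == (1 if point == tuple(p_vec) else 0)
        for point in product(range(q), repeat=len(p_vec))
    )


def frobenius_polynomial_vanishes(q):
    """Check that x^q - x takes the value 0 at every element of F_q."""
    return all((pow(x, q, q) - x) % q == 0 for x in range(q))
```

Running `is_delta_function((1, 2), 3)` and `frobenius_polynomial_vanishes(5)` gives a quick sanity check for small cases; it does not replace the proof requested in the exercise.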


11.2.2 Symmetric Algebra of a Vector Space


It follows from Proposition 11.1 and Exercise 11.3 that for an infinite field $k$, the
polynomial algebra $k[x_0, x_1, \ldots, x_n]$ is isomorphic to the algebra of polynomial
functions $V \to k$, and the latter does not depend on the choice of variables $x_i$
forming a basis of $V^*$. This suggests that the polynomial algebra should have some
coordinate-free description in intrinsic terms of $V^*$. Such a description is given
below.

The space of linear homogeneous polynomials in $x_0, x_1, \ldots, x_n$ coincides with
$V^*$. Every homogeneous polynomial $f(x_0, x_1, \ldots, x_n)$ of higher degree $d$ can be
written (in many ways) as a finite sum of products $a_1 \cdot a_2 \cdots a_d$, where all $a_\nu$ belong
to $V^*$. Such products are called commutative monomials of degree $d$ in the covectors
$a_i \in V^*$. They certainly are linearly related, because for all $a, b \in V^*$, all $\lambda, \mu \in k$,
and all monomials $m'$, $m''$ of total degree $d - 1$ (respectively $d - 2$ in the second relation), the distributive and commutative
laws

$$\begin{aligned} m'(\lambda \cdot a + \mu \cdot b)m'' - \lambda \cdot m'am'' - \mu \cdot m'bm'' &= 0, \\ m'abm'' - m'bam'' &= 0, \end{aligned} \tag{11.3}$$

hold. Let us write $V^*_d$ for the huge⁵ vector space whose basis over $k$ consists of
all words $a_1 a_2 \ldots a_d$ formed by arbitrary $d$-tuples of elements $a_\nu \in V^*$. Thus,
the vectors of $V^*_d$ are formal finite linear combinations of words $a_1 a_2 \ldots a_d$ with
coefficients in $k$, and all such words are linearly independent. Write $S^*_d \subset V^*_d$ for
the linear span of all differences from the left-hand sides of the relations (11.3). The
quotient space $S^d V^* \stackrel{\text{def}}{=} V^*_d / S^*_d$ is called the $d$th symmetric power of the vector space
$V^*$. We put $S^0 V^* = k$ and $S^1 V^* = V^*$. The direct sum


$$SV^* \stackrel{\text{def}}{=} \bigoplus_{d \ge 0} S^d V^* \tag{11.4}$$

is called the symmetric algebra of the vector space $V^*$. The multiplication in this
algebra maps

$$S^n V^* \times S^d V^* \to S^{n+d} V^*$$

and is defined as follows. Fix a basis word $w \in V^*_n$ and define a linear map $L_w : V^*_d \to V^*_{d+n}$ by $m \mapsto wm$ for every basic word $m \in V^*_d$. It sends differences
from the left-hand sides of (11.3) to the same differences but with $wm'$ instead of
$m'$. Hence, $L_w$ sends $S^*_d \subset V^*_d$ into $S^*_{d+n} \subset V^*_{d+n}$, and therefore produces a well-defined map of quotients $L_w : S^d V^* \to S^{d+n} V^*$. For a finite linear combination of
words $f = \sum \lambda_w w$, we put $L_f = \sum \lambda_w L_w : S^d V^* \to S^{d+n} V^*$.


Exercise 11.5 Check that $L_f = 0$ for every $f \in S^*_n$.


⁵ It is infinite-dimensional as soon as $V$ is infinite as a set.
Thus, $L_f$ depends only on the class of $f$ in the quotient space $S^n V^*$. Therefore, the
assignment $f, g \mapsto L_f(g)$ produces a well-defined multiplication map

$$S^n V^* \times S^d V^* \to S^{n+d} V^*$$

that is linear in both arguments $f$, $g$. Since commutative monomials are
multiplied by concatenation,

$$b_1 b_2 \cdots b_n \cdot a_1 a_2 \cdots a_d = b_1 b_2 \cdots b_n a_1 a_2 \cdots a_d,$$

the algebra $SV^*$ is associative. The second equality in (11.3), which holds in the
quotient space $S^{n+d} V^*$, says that the algebra $SV^*$ is commutative. Taking a bit of
liberty, we call elements of $SV^*$ polynomials on $V$, and the elements of $S^d V^* \subset SV^*$
will be called homogeneous polynomials of degree $d$. Such terminology is partially
justified by the next claim.


Proposition 11.2 If the covectors $x_0, x_1, \ldots, x_n$ form a basis of $V^*$, then the
commutative monomials $x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n}$ formed from them constitute a basis in $SV^*$.


Proof By definition, a monomial $x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n}$ is a product of $m_0$ basic covectors
$x_0$, $m_1$ basic covectors $x_1$, ..., $m_n$ basic covectors $x_n$. Since $SV^*$ is commutative,
this monomial is completely determined by the sequence of nonnegative integers
$m_0, m_1, \ldots, m_n$. Let us show that the monomials $x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n}$ with $\sum m_i = d$
form a basis in $S^d V^*$. They certainly span $S^d V^*$, because the relations (11.3) allow
every product $a_1 a_2 \ldots a_d$ to be expanded as a linear combination of monomials
compounded from the $x_i$. To prove their linear independence, it is enough to
construct a linear form $S^d V^* \to k$ that equals 1 on an arbitrarily prescribed
monomial $x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n}$ and vanishes on all other monomials. By Proposition 7.4
on p. 163, a linear form on the quotient space $S^d V^* = V^*_d / S^*_d$ is the same as a
linear form $\varphi : V^*_d \to k$ annihilating $S^*_d$. By Lemma 7.1 on p. 155, every linear
form $\varphi : V^*_d \to k$ is the same as a function on the basis of $V^*_d$ formed by words
$a_1 a_2 \ldots a_d$. The latter is nothing but an arbitrary map

$$\tilde\varphi : \underbrace{V^* \times V^* \times \cdots \times V^*}_{d} \to k, \quad (a_1, a_2, \ldots, a_d) \mapsto \varphi(a_1 a_2 \ldots a_d). \tag{11.5}$$

A linear form $\varphi : V^*_d \to k$ annihilates $S^*_d$ if and only if it vanishes on the left-hand
sides of the relations (11.3). In terms of the map (11.5) corresponding to $\varphi$, this
means that for all $a, b \in V^*$ and all $\lambda, \mu \in k$, the relations⁶

$$\begin{aligned} \tilde\varphi(\ldots, \lambda a + \mu b, \ldots) &= \lambda \tilde\varphi(\ldots, a, \ldots) + \mu \tilde\varphi(\ldots, b, \ldots), \\ \tilde\varphi(\ldots, a, b, \ldots) &= \tilde\varphi(\ldots, b, a, \ldots), \end{aligned} \tag{11.6}$$


⁶ Dotted things in (11.6) are not changed.
hold, i.e., that the map (11.5) is multilinear and symmetric. The first means that $\tilde\varphi$ is
uniquely determined by its values on all collections of basis vectors, that is, by the
set of numbers

$$\varphi_{i_1 i_2 \ldots i_d} \stackrel{\text{def}}{=} \tilde\varphi(x_{i_1}, x_{i_2}, \ldots, x_{i_d}).$$

The multilinear map (11.5) corresponding to such a set takes a $d$-tuple of vectors
$a_\nu = \sum_{i=0}^{n} \alpha_{\nu i} x_i$, where $1 \le \nu \le d$, to the number

$$\tilde\varphi(a_1, a_2, \ldots, a_d) = \sum_{i_1, i_2, \ldots, i_d} \alpha_{1 i_1} \cdot \alpha_{2 i_2} \cdots \alpha_{d i_d} \cdot \varphi_{i_1 i_2 \ldots i_d} \in k \tag{11.7}$$

(all summation indices $i_\nu$ vary independently in the range $0 \le i_\nu \le n$).

Exercise 11.6 Check that the map (11.7) is multilinear.


The symmetry property of $\tilde\varphi$ means that the number $\varphi_{i_1 i_2 \ldots i_d} \in k$ remains fixed
under permutations of indices $i_1, i_2, \ldots, i_d$, that is, it depends only on the number
$m_0$ of indices 0, the number $m_1$ of indices 1, ..., the number $m_n$ of indices $n$
represented in $i_1, i_2, \ldots, i_d$. Write

$$\tilde\xi_{m_0 m_1 \ldots m_n} : V^* \times V^* \times \cdots \times V^* \to k$$

for the symmetric multilinear map that equals 1 on all collections of $m_0$ basis vectors
$x_0$, $m_1$ basis vectors $x_1$, ..., $m_n$ basis vectors $x_n$, and vanishes on all the other
collections of basis vectors. Its associated linear form $\xi_{m_0 m_1 \ldots m_n} : V^*_d \to k$ equals 1
on all words $x_{i_1} x_{i_2} \ldots x_{i_d}$ written by $m_0$ letters $x_0$, $m_1$ letters $x_1$, ..., $m_n$ letters $x_n$,
and vanishes on all the other words compounded from the $x_\nu$. The induced linear
form on $S^d V^* = V^*_d / S^*_d$ sends the monomial $x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n}$ to 1 and annihilates all
the other monomials, as required. □


Corollary 11.1 (From the Proof of Proposition 11.2) The space of symmetric
multilinear maps $\underbrace{V^* \times V^* \times \cdots \times V^*}_{d} \to k$ is canonically isomorphic to the space
of linear maps $S^d V^* \to k$.


Corollary 11.2 Under the assumptions of Proposition 11.2, an isomorphism of
$k$-algebras $\sigma : k[x_0, x_1, \ldots, x_n] \xrightarrow{\sim} SV^*$ is well defined by sending each basic
monomial $x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n} \in k[x_0, x_1, \ldots, x_n]$ to the basic commutative monomial
$x_0^{m_0} x_1^{m_1} \cdots x_n^{m_n} \in SV^*$.


Proof Since $\sigma$ bijectively maps a basis to a basis, it is an isomorphism of vector
spaces. Since $\sigma$ respects multiplication of basis vectors, it is a homomorphism of
algebras. □
11.2.3 Polynomial Functions on an Affine Space


For every vector $v \in V$, there is a well-defined homomorphism of $k$-algebras

$$\operatorname{ev}_v : SV^* \to k, \quad a_1 a_2 \ldots a_d \mapsto \prod_{\nu=1}^{d} \langle v, a_\nu \rangle, \tag{11.8}$$

which sends every commutative product of covectors to the product of their
contractions with the vector $v \in V$. Indeed, we know that the map

$$V^* \times V^* \times \cdots \times V^* \to k, \quad (a_1, a_2, \ldots, a_d) \mapsto \prod_{\nu=1}^{d} \langle v, a_\nu \rangle, \tag{11.9}$$

is uniquely extended to a linear form $V^*_d \to k$, which annihilates the subspace
$S^*_d \subset V^*_d$, because the map (11.9) is symmetric and multilinear. Therefore, it is
consistently factorized through the linear form (11.8) on the quotient space $S^d V^* = V^*_d / S^*_d$.

If we fix a polynomial $f \in SV^*$ and let the vector $v$ run through the vector space
$V$, then we get a polynomial function

$$\tilde f : V \to k, \quad v \mapsto f(v) \stackrel{\text{def}}{=} \operatorname{ev}_v(f), \tag{11.10}$$

which is the polynomial function $\tilde f$ considered at the very beginning of Sect. 11.2,
when we chose a basis $x_0, x_1, \ldots, x_n$ in $V^*$ and identified $k[x_0, x_1, \ldots, x_n]$ with $SV^*$
by Corollary 11.2. Thus, the contraction map (11.8) agrees with evaluation of the
polynomials in coordinates under any choice of coordinates in $V$.


11.2.4 Affine Algebraic Varieties 


Every polynomial function $V \to k$ can be viewed as a function $\mathbb{A}(V) \to k$ on
the affinization of $V$. Such a function is also called a polynomial. The zero set of a
nonconstant polynomial function $f$ on $\mathbb{A}(V)$ is denoted by

$$Z(f) = \{p \in \mathbb{A}(V) \mid f(p) = 0\}$$

and is called an affine algebraic hypersurface of degree $d = \deg f$. Intersections of
algebraic hypersurfaces, i.e., solutions of arbitrary systems of polynomial equations,
are called affine algebraic varieties. Affine subspaces, which are solutions of
systems of linear equations, are the simplest examples of algebraic varieties.
11.3 Projective Algebraic Varieties


11.3.1 Homogeneous Equations 


Typically, a nonconstant polynomial $f \in SV^* \smallsetminus S^0 V^*$ does not define a function
on projective space $\mathbb{P}(V)$, because $f(v)$ usually varies as $v$ runs through the 1-dimensional subspace located behind a point of $\mathbb{P}(V)$. However, for a homogeneous
polynomial $f \in S^d V^*$ of arbitrary positive degree $d \in \mathbb{N}$, the zero set

$$Z(f) = \{v \in \mathbb{P}(V) \mid f(v) = 0\} \tag{11.11}$$

is still well defined, because for every nonzero $\lambda \in k$ the equations $f(v) = 0$ and $f(\lambda v) = \lambda^d f(v) = 0$ are
equivalent. Geometrically, an affine hypersurface $Z(f) \subset \mathbb{A}(V)$ for homogeneous $f$
is ruled by the lines passing through the origin, that is, it consists of the points of
$\mathbb{P}(V)$. The zero set (11.11) considered as a figure within $\mathbb{P}(V)$ is called a projective
algebraic hypersurface of degree $d = \deg f$. Let me stress that the notation
$Z(f)$ in projective geometry always assumes that $f$ is homogeneous of positive
degree. Intersections of projective hypersurfaces, i.e., nonzero solutions of systems
of homogeneous polynomial equations considered up to proportionality, are called
projective algebraic varieties.


Example 11.3 (Projective Subspaces) The simplest examples of projective varieties
are provided by the projective subspaces $\mathbb{P}(U) \subset \mathbb{P}(V)$ associated with vector
subspaces $U \subset V$. They are given by systems of homogeneous linear equations
$\xi(v) = 0$, where $\xi$ runs through $\operatorname{Ann}(U) \subset V^*$ or through some basis of $\operatorname{Ann}(U)$.
For example, for every pair of nonproportional vectors⁷ $a, b \in V$, there exists a
unique projective line $(ab) \subset \mathbb{P}(V)$ passing through both $a$ and $b$. Such a line is the
projectivization of the 2-dimensional vector subspace spanned by $a$, $b$. It consists of
nontrivial linear combinations $\lambda a + \mu b$, $\lambda, \mu \in k$, considered up to proportionality.
The ratio $(\lambda : \mu)$ can be used as an internal homogeneous coordinate within the
line $(ab)$. On the other hand, the line $(ab)$ can be described by linear homogeneous
equations $\xi(v) = 0$ in the unknown vector $v \in V$ for all $\xi \in \operatorname{Ann}(a) \cap \operatorname{Ann}(b) \subset V^*$.


Exercise 11.7 Show that for every affine chart $U_\xi \subset \mathbb{P}_n$ and $k$-dimensional
projective subspace $K \subset \mathbb{P}_n$, either $K \cap U_\xi$ is empty or it is some $k$-dimensional
affine subspace in $U_\xi$.


It follows from Proposition 6.3 on p. 139 that for every pair of projective subspaces
$K, L \subset \mathbb{P}_n$, the inequality $\dim(K \cap L) \ge \dim K + \dim L - n$ holds. It implies, in
particular, that every pair of lines in the projective plane intersect.


⁷ That is, two distinct points of $\mathbb{P}(V)$.

Fig. 11.5 Real conic


Example 11.4 (Smooth Real Conic) The second-degree curve $C \subset \mathbb{P}_2 = \mathbb{P}(\mathbb{R}^3)$
given by the equation

$$x_0^2 + x_1^2 = x_2^2 \tag{11.12}$$

is called a smooth real conic (Fig. 11.5). In the standard chart $U_1$, where $x_1 = 1$, in
local affine coordinates $t_0 = x_0|_{U_1} = x_0/x_1$, $t_2 = x_2|_{U_1} = x_2/x_1$, equation (11.12)
becomes the equation of a hyperbola, $t_2^2 - t_0^2 = 1$. In the standard chart $U_2$,
where $x_2 = 1$, in local coordinates $t_0 = x_0|_{U_2} = x_0/x_2$, $t_1 = x_1|_{U_2} = x_1/x_2$, it is the equation of a circle, $t_0^2 + t_1^2 = 1$. In the nonstandard chart $U_{x_1 + x_2}$,
where $x_1 + x_2 = 1$, in local affine coordinates $t = x_0|_{U_{x_1 + x_2}} = x_0/(x_1 + x_2)$,
$u = (x_2 - x_1)|_{U_{x_1 + x_2}} = (x_2 - x_1)/(x_2 + x_1)$, we get⁸ the parabola $t^2 = u$. Thus,
ellipse, parabola, and hyperbola are just different affine pieces of the same projective
conic (11.12). The appearance of $C$ in an affine chart depends on the positional
relationship between $C$ and the infinite line of the chart. Elliptic, parabolic, and
hyperbolic shapes appear, respectively, when the infinite line does not intersect $C$,
is tangent to $C$, and crosses $C$ in two distinct points.
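The three affine equations can be verified on sample points of the conic. The sketch below is an illustration that assumes integer points of (11.12), for instance Pythagorean triples such as $(3 : 4 : 5)$, with $x_1 \ne 0$, $x_2 \ne 0$, and $x_1 + x_2 \ne 0$; the function name `chart_residuals` is hypothetical.

```python
from fractions import Fraction


def chart_residuals(x0, x1, x2):
    """Residuals of the three chart equations at the point (x0 : x1 : x2).

    Returns (hyperbola, circle, parabola); all three vanish when the
    point satisfies x0^2 + x1^2 = x2^2 and is visible in all three charts.
    """
    # chart U_1: t2^2 - t0^2 = 1
    hyperbola = Fraction(x2, x1) ** 2 - Fraction(x0, x1) ** 2 - 1
    # chart U_2: t0^2 + t1^2 = 1
    circle = Fraction(x0, x2) ** 2 + Fraction(x1, x2) ** 2 - 1
    # chart U_{x1 + x2}: t^2 = u
    t = Fraction(x0, x1 + x2)
    u = Fraction(x2 - x1, x2 + x1)
    parabola = t**2 - u
    return hyperbola, circle, parabola
```

For the point $(3 : 4 : 5)$ one gets $t = 1/3$ and $u = 1/9$ in the last chart, so all three residuals are zero.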


11.3.2 Projective Closure of an Affine Hypersurface 


Consider the affine space $\mathbb{A}^n$ as the standard affine chart $U_0$ within projective space
$\mathbb{P}_n$. Then for every affine hypersurface $X = Z(f) \subset \mathbb{A}^n$ of degree $d = \deg f$,
there exists a projective hypersurface $\overline{X} = Z(\overline{f}) \subset \mathbb{P}_n$ of the same degree
$d = \deg f = \deg \overline{f}$ such that $\overline{X} \cap U_0 = X$. It is called the projective closure of $X$. If
we expand $f$ as a sum of homogeneous components $f = f_0 + f_1 + f_2 + \cdots + f_d$, where
each $f_i$ is homogeneous of degree $i$, then the homogeneous degree-$d$ polynomial

$$\overline{f}(x_0, x_1, \ldots, x_n) = f_0 \cdot x_0^d + f_1(x_1, x_2, \ldots, x_n) \cdot x_0^{d-1} + \cdots + f_d(x_1, x_2, \ldots, x_n)$$

is made from $f$ by multiplication of each monomial by an appropriate power of $x_0$,
completing the total degree of the monomial up to $d$. The polynomial $\overline{f}$ turns again
into $f$ for $x_0 = 1$:

$$f(x_1, x_2, \ldots, x_n) = \overline{f}(1, x_1, x_2, \ldots, x_n).$$

⁸ Move $x_1^2$ to the right-hand side and divide both sides by $(x_2 + x_1)^2$.

For example, the projective closure of the affine plane curve $x_1 = x_2^3$ is given by
the homogeneous equation $x_0^2 x_1 = x_2^3$ and has just one infinite point $(0 : 1 : 0)$. In
the standard affine chart $U_1$, this projective cubic looks like the semicubic parabola
$x_0^2 = x_2^3$ with a cusp at $(0 : 1 : 0)$.

In the general case, the complement $\overline{X} \smallsetminus X = \overline{X} \cap U_0^{(\infty)}$ is a projective
hypersurface within the infinitely distant hyperplane $x_0 = 0$. In homogeneous
coordinates $(x_1 : x_2 : \cdots : x_n)$, it is given by the homogeneous equation
$f_d(x_1, x_2, \ldots, x_n) = 0$. In other words, the infinite points of $\overline{X}$ are the zeros of the
leading homogeneous form of $f$ considered up to proportionality. In affine geometry,
they are called the asymptotic directions of the affine variety $X = Z(f)$.
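Homogenization is a purely combinatorial operation on exponents, which the following sketch makes explicit. It assumes, only for illustration, that a polynomial in $x_1, \ldots, x_n$ is stored as a dictionary mapping exponent tuples $(e_1, \ldots, e_n)$ to nonzero coefficients; the names `homogenize` and `leading_form` are hypothetical.

```python
def homogenize(f):
    """Projective closure equation: pad every monomial with a power of x_0.

    f maps exponent tuples (e_1, ..., e_n) to nonzero coefficients.
    The result maps tuples (e_0, e_1, ..., e_n) of total degree d = deg f
    to the same coefficients.
    """
    d = max(sum(exponents) for exponents in f)
    return {
        (d - sum(exponents),) + tuple(exponents): coefficient
        for exponents, coefficient in f.items()
    }


def leading_form(f):
    """Top-degree homogeneous component f_d, whose zeros are the infinite points."""
    d = max(sum(exponents) for exponents in f)
    return {e: c for e, c in f.items() if sum(e) == d}
```

For the cubic $x_1 - x_2^3$, stored as `{(1, 0): 1, (0, 3): -1}`, the call `homogenize` returns the exponents of $x_0^2 x_1 - x_2^3$, and `leading_form` returns $-x_2^3$, whose only zero up to proportionality is $(x_1 : x_2) = (1 : 0)$, that is, the infinite point $(0 : 1 : 0)$.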


11.3.3 Space of Hypersurfaces 


Since proportional equations have the same solution set, the projective hypersurfaces of degree $d$ in $\mathbb{P}(V)$ can be treated as points of projective space $\mathbb{P}(S^d V^*)$,
which is called the space of degree-$d$ hypersurfaces in $\mathbb{P}(V)$.


Exercise 11.8 Find $\dim \mathbb{P}(S^d V^*)$.


It should be kept in mind that if the ground field is not algebraically closed, then
some $f \in S^d V^*$ may determine nothing geometrically reminiscent of a hypersurface.
For example, consider $Z(x_0^2 + x_1^2) = \varnothing$ on $\mathbb{P}_1$ over $\mathbb{R}$. Even over an algebraically
closed field, some distinct points $f \ne g$ in $\mathbb{P}(S^d V^*)$ produce the same zero set
$Z(f) = Z(g)$ in $\mathbb{P}(V)$. For example, the nonproportional polynomials $x_0^2 x_1$ and
$x_0 x_1^2$ define the same two-point set $Z(f) = Z(g) = \{(1 : 0), (0 : 1)\}$ on $\mathbb{P}_1$.
Nevertheless, these geometric disharmonies can be overcome by passing to the algebraic closure and introducing multiplicities of components. The latter means that
for $f = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$, where $p_1, p_2, \ldots, p_k$ are different irreducible polynomials,
we define $Z(f)$ to be the union of $k$ components $Z(p_1), Z(p_2), \ldots, Z(p_k)$ having
multiplicities $m_1, m_2, \ldots, m_k$. Thus, in the previous examples, $Z(x_0^2 + x_1^2)$ becomes
two points $(1 : \pm i)$ over $\mathbb{C}$, and $Z(x_0^2 x_1)$, $Z(x_0 x_1^2)$ become distinct, because $(1 : 0)$,
$(0 : 1)$ appear in the first variety with multiplicities 1, 2, whereas in the second,
they appear with multiplicities 2, 1. However, any strong explicit justification of
a bijection between points of $\mathbb{P}(S^d V^*)$ and geometric objects in $\mathbb{P}(V)$ would take
us too far afield. Let us postpone such a discussion to the second volume of this
textbook.


11.3.4 Linear Systems of Hypersurfaces 


For fixed $p \in \mathbb{P}(V)$, the relation $f(p) = 0$ is linear in $f \in S^d V^*$. Thus, the degree-$d$
hypersurfaces $Z(f) \subset \mathbb{P}(V)$ passing through a given point $p$ form a hyperplane in the
space of hypersurfaces. A projective subspace within the space of hypersurfaces is
called a linear system of hypersurfaces. All hypersurfaces in a linear system spanned
by $Z(f_1), Z(f_2), \ldots, Z(f_m)$ are given by equations of the form

$$\lambda_1 f_1 + \lambda_2 f_2 + \cdots + \lambda_m f_m = 0, \quad \text{where } \lambda_1, \lambda_2, \ldots, \lambda_m \in k.$$

In particular, all these hypersurfaces contain the intersection

$$Z(f_1) \cap Z(f_2) \cap \cdots \cap Z(f_m).$$

Traditionally, linear systems of dimensions 1, 2, and 3 are called pencils,
nets, and webs respectively. Since every line in a projective space has nonempty
intersection with every hyperplane, it follows that for every point, there is a
hypersurface passing through that point in every pencil of hypersurfaces over an
arbitrary ground field $k$.
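For a pencil spanned by $Z(f_1)$ and $Z(f_2)$, the member through a given point $p$ can be written down explicitly, since the condition $\lambda_1 f_1(p) + \lambda_2 f_2(p) = 0$ is one linear equation on $(\lambda_1 : \lambda_2)$. The sketch below assumes that $f_1$, $f_2$ are supplied as Python callables of the homogeneous coordinates; the function name `pencil_member_through` is hypothetical.

```python
def pencil_member_through(f1, f2, p):
    """Coefficients (lambda1, lambda2) of the pencil member passing through p.

    The pair (f2(p), -f1(p)) solves lambda1*f1(p) + lambda2*f2(p) = 0.
    When both values vanish, p lies on every member of the pencil and
    None is returned, because no single member is distinguished.
    """
    a, b = f1(*p), f2(*p)
    if a == 0 and b == 0:
        return None
    return (b, -a)
```

With $f_1 = x_0^2 + x_1^2 - x_2^2$, $f_2 = x_0 x_1$, and $p = (1 : 1 : 1)$ this yields $(\lambda_1 : \lambda_2) = (1 : -1)$, that is, the conic $f_1 - f_2 = 0$.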


Example 11.5 (Pencils of Lines in a Plane) The lines in the projective plane $\mathbb{P}_2 = \mathbb{P}(V)$ are in bijection with the points of the dual plane $\mathbb{P}_2^\times = \mathbb{P}(V^*)$. Every 1-dimensional subspace $k \cdot \xi \subset V^*$ produces the line $\mathbb{P}(\operatorname{Ann} \xi) \subset \mathbb{P}(V)$, and the linear
form $\xi$ is recovered from the line uniquely up to proportionality. Conversely, every
line in the dual plane $\mathbb{P}_2^\times$ has the form $\mathbb{P}(\operatorname{Ann} v)$ for some $v \in V$, unique up to
proportionality, and it consists of all lines in $\mathbb{P}_2$ passing through the point $v \in \mathbb{P}_2$.
Thus, every pencil of lines in $\mathbb{P}_2$ is the set of all lines passing through some point.
This point is called the center of the pencil.


Example 11.6 (Collections of Points in $\mathbb{P}_1$ and Veronese Curves) Write $U$ for the
coordinate space $k^2$ with coordinates $x_0, x_1$ and consider the projective line
$\mathbb{P}_1 = \mathbb{P}(U)$. An unordered collection of $d$ points (some of which may coincide)

$$p_1, p_2, \ldots, p_d \in \mathbb{P}_1, \quad p_\nu = (p_{\nu,0} : p_{\nu,1}), \tag{11.13}$$

can be viewed as a projective hypersurface, the zero set of the homogeneous degree-$d$ polynomial

$$f(x_0, x_1) = \prod_{\nu=1}^{d} \det(x, p_\nu) = \prod_{\nu=1}^{d} (p_{\nu,1} x_0 - p_{\nu,0} x_1). \tag{11.14}$$

By analogy with inhomogeneous polynomials $f(t) \in k[t]$, whose roots are collections of points in $\mathbb{A}^1$, we say that the points (11.13) are the projective roots of the
homogeneous polynomial (11.14). Note that projective roots are defined only up to
proportionality. In an affine chart whose infinity differs from all points (11.13), the
factorization (11.14) is the usual factorization of an inhomogeneous polynomial in
one variable provided by its roots.⁹ For example, if $p_{\nu,1} \ne 0$ for all $\nu$, then in the
chart $U_1$ with local affine coordinate $t = x_0/x_1$, the factorization (11.14) can be
rewritten as $f(t) = \text{const} \cdot \prod_{\nu=1}^{d} (t - \alpha_\nu)$, where $\alpha_\nu = p_{\nu,0}/p_{\nu,1}$ and $\text{const} = \prod_\nu p_{\nu,1}$,
and in the chart $U_0$ with coordinate $s = 1/t$, the same factorization becomes
$f(s) = \text{const} \cdot \prod_{\nu=1}^{d} (1 - \alpha_\nu s)$.

Every nonzero homogeneous polynomial of degree $d$ in $(x_0, x_1)$ certainly has at
most $d$ projective roots on $\mathbb{P}_1$. If $k$ is algebraically closed, then there are exactly
$d$ roots counted with multiplicities, where the multiplicity of the root $p$ means the
number of factors proportional to $\det(x, p)$ in the irreducible factorization (11.14).
We conclude that over an algebraically closed field $k$, the unordered configurations
of $d$ points on $\mathbb{P}_1$, in which some points may coincide, are in bijection with the
points of the projective space $\mathbb{P}_d = \mathbb{P}(S^d U^*)$ of degree-$d$ "hypersurfaces" in $\mathbb{P}_1$.

Over a field $k$, not necessarily algebraically closed, the configurations in which
all $d$ points coincide form a curve $C_d \subset \mathbb{P}_d = \mathbb{P}(S^d U^*)$ called the degree-$d$ Veronese
curve or rational normal curve of degree $d$. This curve can be viewed as the image
of the Veronese map

$$\operatorname{ver}_d : \mathbb{P}(U^*) \to \mathbb{P}(S^d U^*), \quad \varphi \mapsto \varphi^d, \tag{11.15}$$

which sends a linear form $\varphi \in U^*$ to its $d$th power $\varphi^d \in S^d(U^*)$. Geometrically,
$\varphi$ is a linear equation of a point $p = \operatorname{Ann} \varphi \in \mathbb{P}_1$, whereas the form $\varphi^d$ has $p$ as a
projective root of multiplicity $d$.

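Write linear forms $\varphi \in U^*$ as $\varphi(x) = \alpha_0 x_0 + \alpha_1 x_1$ and degree-$d$ forms $f \in S^d(U^*)$
as $f(x) = \sum_\nu a_\nu \cdot \binom{d}{\nu} x_0^{d-\nu} x_1^\nu$. Let us use $(\alpha_0 : \alpha_1)$ and $(a_0 : a_1 : \cdots : a_d)$ as
homogeneous coordinates in $\mathbb{P}_1^\times = \mathbb{P}(U^*)$ and in $\mathbb{P}_d = \mathbb{P}(S^d U^*)$ respectively. Then
the Veronese curve is described by the parametric equation

$$(\alpha_0 : \alpha_1) \mapsto (a_0 : a_1 : \cdots : a_d) = (\alpha_0^d : \alpha_0^{d-1} \alpha_1 : \alpha_0^{d-2} \alpha_1^2 : \cdots : \alpha_1^d). \tag{11.16}$$


⁹ See Proposition 3.5 on p. 50.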
We see that $C_d$ consists of all points $(a_0 : a_1 : \cdots : a_d) \in \mathbb{P}_d$ whose coordinates
form a geometric progression. This means that

$$\operatorname{rk} \begin{pmatrix} a_0 & a_1 & a_2 & \cdots & a_{d-2} & a_{d-1} \\ a_1 & a_2 & a_3 & \cdots & a_{d-1} & a_d \end{pmatrix} = 1,$$

and it is equivalent to a system of homogeneous quadratic equations $a_i a_j = a_{i+1} a_{j-1}$, which certify the vanishing of all $2 \times 2$ minors of the above matrix.
Therefore the Veronese curve is in fact an algebraic variety. For example, the
Veronese conic $C_2 \subset \mathbb{P}_2$ consists of all quadratic trinomials $a_0 x_0^2 + 2 a_1 x_0 x_1 + a_2 x_1^2$
that are perfect squares of linear forms. It is given by the well-known quadratic
equation

$$D/4 = -\det \begin{pmatrix} a_0 & a_1 \\ a_1 & a_2 \end{pmatrix} = a_1^2 - a_0 a_2 = 0 \tag{11.17}$$

and admits the parametric description

$$a_0 = \alpha_0^2, \quad a_1 = \alpha_0 \alpha_1, \quad a_2 = \alpha_1^2. \tag{11.18}$$


The rational normal curve (11.16) intersects every hyperplane defined by the
linear equation

$$A_0 a_0 + A_1 a_1 + \cdots + A_d a_d = 0$$

precisely in the Veronese images $\operatorname{ver}_d((\alpha_0 : \alpha_1))$ of the projective roots $(\alpha_0 : \alpha_1) \in \mathbb{P}_1$ of the homogeneous polynomial $\sum_\nu A_\nu \cdot \alpha_0^{d-\nu} \alpha_1^\nu$ of degree $d$. Since the latter has
at most $d$ roots in $\mathbb{P}_1$, no collection of $d + 1$ distinct points on the Veronese curve
all lie on a hyperplane. Hence, for all $k$ in the range $2 \le k \le d$, every collection of
$k + 1$ distinct points on $C_d$ is linearly generic, meaning that there is no $(k - 1)$-dimensional projective subspace that contains them all. In particular, every triple
of distinct points on $C_d$ is noncollinear for all $d \ge 2$. Over an algebraically closed
field $k$ of zero characteristic, the Veronese curve $C_d$ intersects every hyperplane in
exactly $d$ points (some of which may coincide). This is the geometric reason for
saying that $C_d$ has degree $d$.
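Both properties of the Veronese curve discussed above, the vanishing of the $2 \times 2$ minors and the linear genericity of $d + 1$ distinct points, can be tested on examples. The following sketch works over the rationals with exact arithmetic; the helper names `veronese`, `on_veronese_curve`, and `det` are hypothetical, and the determinant is computed by plain Gaussian elimination.

```python
from fractions import Fraction
from itertools import combinations


def veronese(alpha0, alpha1, d):
    """Coordinates (a_0, ..., a_d) of ver_d((alpha0 : alpha1)) as in (11.16)."""
    return [alpha0 ** (d - nu) * alpha1**nu for nu in range(d + 1)]


def on_veronese_curve(a):
    """Check that all 2x2 minors a_i * a_{j+1} - a_{i+1} * a_j vanish."""
    d = len(a) - 1
    return all(
        a[i] * a[j + 1] - a[i + 1] * a[j] == 0
        for i, j in combinations(range(d), 2)
    )


def det(rows):
    """Determinant of a square matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, result = len(m), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return result
```

For $d = 3$ and the four distinct points $(1 : 0)$, $(1 : 1)$, $(1 : 2)$, $(0 : 1)$ of the projective line, the matrix whose rows are the corresponding values of `veronese` has nonzero determinant, so these four points of $C_3$ do not lie in a common plane of $\mathbb{P}_3$.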


11.4 Complementary Subspaces and Projections 


Projective subspaces $K = \mathbb{P}(U)$ and $L = \mathbb{P}(W)$ in $\mathbb{P}_n = \mathbb{P}(V)$ are called
complementary if $K \cap L = \varnothing$ and $\dim K + \dim L = n - 1$. For example, two
nonintersecting lines in $\mathbb{P}_3$ are always complementary. In the language of vectors,
the complementarity of $\mathbb{P}(U)$ and $\mathbb{P}(W)$ in $\mathbb{P}(V)$ means that $V = U \oplus W$, because
of $U \cap W = \{0\}$ and

$$\dim U + \dim W = \dim K + 1 + \dim L + 1 = (n + 1) = \dim V.$$

In this case, every $v \in V$ admits a unique decomposition $v = u + w$, where $u \in U$,
$w \in W$. If $v \notin U \cup W$, then both components $u$, $w$ are nonzero. Geometrically, this
means that for every point $v \notin K \sqcup L$, there exists a unique line $\ell = (uw)$ passing
through $v$ and intersecting both subspaces $K$, $L$ in some points $u$, $w$. Therefore, every
pair of complementary projective subspaces $K, L \subset \mathbb{P}_n$ produces a well-defined
projection $\pi_L^K : (\mathbb{P}_n \smallsetminus K) \to L$ from $K$ onto $L$, which acts on $L$ identically and sends
every $v \in \mathbb{P}_n \smallsetminus (K \sqcup L)$ to $w = (uw) \cap L$, where $(uw)$ is the unique line passing through
$v$ and crossing $K$, $L$ at some points $u$, $w$. In terms of homogeneous coordinates
$(x_0 : x_1 : \cdots : x_n)$ such that $(x_0 : x_1 : \cdots : x_m)$ and $(x_{m+1} : x_{m+2} : \cdots : x_n)$ are
the coordinates within $K$ and $L$ respectively, the projection $\pi_L^K$ just removes the first
$(m + 1)$ coordinates.


Fig. 11.6 Projection from $p \in C$ onto $L$


Example 11.7 (Projection of a Conic onto a Line) In Example 11.4, we considered
the smooth conic C given in $\mathbb{P}_2$ by the homogeneous equation $x_0^2 + x_1^2 = x_2^2$. Let us
project C from the point $p = (1 : 0 : 1) \in C$ onto the line L given by the equation
$x_0 = 0$. In the standard chart $U_2$, where $x_2 = 1$, this projection $\pi_L^p : C \to L$ is
shown in Fig. 11.6. It establishes a birational bijection between the conic and the
line, where birationality means that the coordinates of corresponding points
$q = (q_0 : q_1 : q_2) \in C$, $t = (0 : t_1 : t_2) \in L$ are rational functions of each other:

$$(t_1 : t_2) = (q_1 : (q_2 - q_0)), \qquad (q_0 : q_1 : q_2) = ((t_1^2 - t_2^2) : 2 t_1 t_2 : (t_1^2 + t_2^2)). \tag{11.19}$$

The projection becomes bijective when we define $\pi_L^p(p)$ to be the intersection point¹⁰ of
L with the tangent line to C at p. All the lines passing through p are in bijection with
the points of L, and every such line except for the tangent intersects C in exactly one
point besides p.
Note that C can be transformed to the Veronese conic¹¹ $a_1^2 = a_0 a_2$ by the
invertible linear change of coordinates

$$\begin{aligned}
a_0 &= x_2 + x_0, & x_0 &= (a_0 - a_2)/2,\\
a_1 &= x_1, & x_1 &= a_1,\\
a_2 &= x_2 - x_0, & x_2 &= (a_0 + a_2)/2.
\end{aligned}$$

Under this transformation, the rational parametrization (11.19) becomes the
parametrization (11.18) of the Veronese conic.
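
The formulas of Example 11.7 are easy to test numerically. The following minimal sketch (the function names are ours, chosen for illustration) checks on a few integer parameters that the parametrized points lie on the conic $x_0^2 + x_1^2 = x_2^2$, that projecting them from p = (1 : 0 : 1) returns the parameter, and that the change of coordinates sends them to points satisfying $a_1^2 = a_0 a_2$. We assume here that (11.18) is the usual parametrization $(t_1^2 : t_1 t_2 : t_2^2)$ of the Veronese conic.

```python
def parametrize(t):
    """(t1 : t2) -> point (q0 : q1 : q2) of the conic x0^2 + x1^2 = x2^2."""
    t1, t2 = t
    return (t1 * t1 - t2 * t2, 2 * t1 * t2, t1 * t1 + t2 * t2)


def project_from_p(q):
    """Projection from p = (1:0:1) onto the line x0 = 0; returns (t1, t2)."""
    q0, q1, q2 = q
    return (q1, q2 - q0)


def on_conic(q):
    q0, q1, q2 = q
    return q0 * q0 + q1 * q1 == q2 * q2


def proportional(a, b):
    """True if two nonzero coordinate tuples represent the same projective point."""
    n = len(a)
    return all(a[i] * b[j] == a[j] * b[i] for i in range(n) for j in range(n))


def to_veronese(x):
    """Change of coordinates a0 = x2 + x0, a1 = x1, a2 = x2 - x0."""
    x0, x1, x2 = x
    return (x2 + x0, x1, x2 - x0)


for t in [(1, 2), (3, 1), (2, 5), (-1, 4)]:
    q = parametrize(t)
    assert on_conic(q)
    assert proportional(project_from_p(q), t)
    a0, a1, a2 = to_veronese(q)
    assert a1 * a1 == a0 * a2
    assert proportional((a0, a1, a2), (t[0] ** 2, t[0] * t[1], t[1] ** 2))

print(parametrize((2, 1)))  # (3, 4, 5)
```

For integer parameters the output of `parametrize` is a Pythagorean triple, which is the observation behind Problem 11.7.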


11.5 Linear Projective Isomorphisms 


11.5.1 Action of a Linear Isomorphism on Projective Space 


Every linear isomorphism of vector spaces $F : U \xrightarrow{\sim} W$ induces a well-defined
bijection $\overline{F} : \mathbb{P}(U) \xrightarrow{\sim} \mathbb{P}(W)$ called a linear projective transformation or an
isomorphism of projective spaces $\mathbb{P}(U)$ and $\mathbb{P}(W)$.


Exercise 11.9 For two distinct hyperplanes $L_1, L_2 \subset \mathbb{P}_n$ and a point $p \notin L_1 \cup L_2$,
show that the projection from p onto $L_2$ establishes an isomorphism of projective
spaces $L_1 \xrightarrow{\sim} L_2$.


Theorem 11.1 Let U and W be vector spaces of the same dimension $\dim U = \dim W = n + 1$.
Given two ordered collections of n + 2 points $p_0, p_1, \ldots, p_{n+1} \in \mathbb{P}(U)$,
$q_0, q_1, \ldots, q_{n+1} \in \mathbb{P}(W)$ such that no n + 1 points of each collection lie
within a hyperplane, there exists a unique, up to proportionality, linear isomorphism
$F : U \xrightarrow{\sim} W$ such that $\overline{F}(p_i) = q_i$ for all i.


¹⁰ In Fig. 11.6, the tangent line through p crosses L at the point (0 : 1 : 0) lying at infinity.
¹¹ See formula (11.17) on p. 268.
Proof Fix nonzero vectors $u_i$, $w_i$ representing points $p_i$, $q_i$ and use $u_0, u_1, \ldots, u_n$
and $w_0, w_1, \ldots, w_n$ as bases for U and W. The projective transformation
$\overline{F} : \mathbb{P}(U) \to \mathbb{P}(W)$ induced by the linear map $F : U \to W$ takes $p_0, p_1, \ldots, p_n$ to
$q_0, q_1, \ldots, q_n$ if and only if the matrix of F in our bases is nondegenerate diagonal.
Write $\lambda_0, \lambda_1, \ldots, \lambda_n$ for its diagonal elements and consider the remaining vectors
$u_{n+1} = x_0 u_0 + x_1 u_1 + \cdots + x_n u_n$ and $w_{n+1} = y_0 w_0 + y_1 w_1 + \cdots + y_n w_n$. The
condition $F(u_{n+1}) = \lambda_{n+1} w_{n+1}$ is equivalent to n + 1 equalities $y_i = \lambda_{n+1}^{-1} \lambda_i x_i$,
$0 \le i \le n$, where each $x_i$ is nonzero, because otherwise, the n + 1 points $p_\nu$ with $\nu \ne i$
would lie in the hyperplane $x_i = 0$. For the same reason, each $y_i$ is also nonzero.
Thus, the diagonal elements $(\lambda_0, \lambda_1, \ldots, \lambda_n) = \lambda_{n+1} \cdot (y_0/x_0, y_1/x_1, \ldots, y_n/x_n)$
are nonzero and unique up to a constant factor $\lambda_{n+1} \ne 0$. □
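
The proof is constructive, and the construction can be sketched in a few lines. The code below is only an illustration over the rationals; the helper names `solve`, `projective_map`, and `proportional` are ours. It expands the last vectors through the first n + 1 ones, sets $\lambda_i = y_i/x_i$ (that is, it normalizes $\lambda_{n+1} = 1$), and returns the resulting linear map.

```python
from fractions import Fraction


def solve(basis, b):
    """Coordinates c with sum_i c[i] * basis[i] == b (Gauss-Jordan elimination).

    Hypothetical helper; it assumes that `basis` really is a basis.
    """
    n = len(b)
    # Row r of the augmented matrix: r-th coordinate of each basis vector, then b[r].
    m = [[Fraction(basis[c][r]) for c in range(n)] + [Fraction(b[r])] for r in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [entry / lead for entry in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * p for a, p in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def projective_map(us, ws):
    """Linear map F with F(us[i]) proportional to ws[i] for all n + 2 indices i."""
    dim = len(us) - 1                 # dim = n + 1
    x = solve(us[:dim], us[dim])      # u_{n+1} = sum x_i u_i
    y = solve(ws[:dim], ws[dim])      # w_{n+1} = sum y_i w_i
    lam = [yi / xi for xi, yi in zip(x, y)]   # diagonal entries, lambda_{n+1} = 1

    def F(v):
        c = solve(us[:dim], v)        # coordinates of v in the basis u_0, ..., u_n
        return [sum(lam[i] * c[i] * ws[i][r] for i in range(dim)) for r in range(dim)]

    return F


def proportional(a, b):
    n = len(a)
    return all(a[i] * b[j] == a[j] * b[i] for i in range(n) for j in range(n))


# Three points on a projective line sent to three other points.
us = [(1, 0), (0, 1), (1, 1)]
ws = [(1, 1), (1, -1), (3, 1)]
F = projective_map(us, ws)
assert all(proportional(F(u), w) for u, w in zip(us, ws))
```

Rescaling any representative $u_i$ or $w_i$ changes F only by a common factor, in agreement with the uniqueness statement of the theorem.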


11.5.2 Linear Projective Group 


The linear projective automorphisms of $\mathbb{P}(V)$ form a transformation group whose
elements are linear automorphisms of V considered up to proportionality. This group
is denoted by PGL(V) and called the linear projective group. For the coordinate
space $V = k^{n+1}$, this group consists of proportionality classes of invertible square
matrices and is denoted by $\mathrm{PGL}_{n+1}(k)$.


Example 11.8 (Linear Fractional Transformations of a Line) The group $\mathrm{PGL}_2(k)$
consists of nondegenerate 2 × 2 matrices $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ considered up to
proportionality. Such a matrix acts on the coordinate projective line $\mathbb{P}_1 = \mathbb{P}(k^2)$ by the rule
$(x_0 : x_1) \mapsto ((a x_0 + b x_1) : (c x_0 + d x_1))$. In the standard chart $U_1 \simeq \mathbb{A}^1$ with affine
coordinate $t = x_0/x_1$, this action looks like the linear fractional transformation

$$t \mapsto \frac{at + b}{ct + d}.$$

This affine notation makes it obvious that proportional matrices produce the same
transformation and that for every ordered triple of distinct points q, r, s, there exists a
unique linear fractional transformation sending those points to ∞, 0, 1 respectively.
It acts by the rule

$$t \mapsto \frac{t - r}{t - q} \cdot \frac{s - q}{s - r}. \tag{11.20}$$
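
The transformation sending q, r, s to ∞, 0, 1 can also be written in homogeneous coordinates, which avoids a special treatment of the point at infinity. The sketch below is ours, not a formula from the text: it uses the determinant $\det(a, b) = a_0 b_1 - a_1 b_0$ and the rule $t \mapsto (\det(t, r)\det(s, q) : \det(t, q)\det(s, r))$, which in the affine coordinate $t = x_0/x_1$ reduces to $t \mapsto \frac{t - r}{t - q} \cdot \frac{s - q}{s - r}$.

```python
from fractions import Fraction


def det(a, b):
    return a[0] * b[1] - a[1] * b[0]


def to_standard(q, r, s):
    """Map P1 -> P1 sending q, r, s to infinity = (1:0), 0 = (0:1), 1 = (1:1)."""
    def phi(t):
        return (det(t, r) * det(s, q), det(t, q) * det(s, r))
    return phi


def affine(point):
    """Affine coordinate x0/x1, or None for the point at infinity."""
    return None if point[1] == 0 else Fraction(point[0], point[1])


q, r, s = (2, 1), (5, 1), (-1, 1)
phi = to_standard(q, r, s)
assert affine(phi(q)) is None      # q goes to infinity
assert affine(phi(r)) == 0
assert affine(phi(s)) == 1
print(affine(phi((7, 3))))         # -4
```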
11.6 Cross Ratio


The difference of affine coordinates $a = a_0/a_1$, $b = b_0/b_1$ of points $a = (a_0 : a_1)$,
$b = (b_0 : b_1)$ on $\mathbb{P}_1 = \mathbb{P}(k^2)$ coincides, up to a nonzero constant factor, with the
determinant of the matrix of homogeneous coordinates of the points:

$$a - b = \frac{a_0}{a_1} - \frac{b_0}{b_1} = \frac{a_0 b_1 - a_1 b_0}{a_1 b_1} = \frac{\det(a, b)}{a_1 b_1}.$$


For every ordered quadruple of distinct points $p_1, p_2, p_3, p_4 \in \mathbb{P}_1$, the quantity

$$[p_1, p_2, p_3, p_4] = \frac{(p_1 - p_3)(p_2 - p_4)}{(p_1 - p_4)(p_2 - p_3)} = \frac{\det(p_1, p_3) \cdot \det(p_2, p_4)}{\det(p_1, p_4) \cdot \det(p_2, p_3)} \tag{11.21}$$

is called the cross ratio of the quadruple of points. Geometrically, the cross ratio
$[p_1, p_2, p_3, p_4]$ considered as a point on $\mathbb{P}_1 = \mathbb{P}(k^2)$ coincides with the image of
$p_4$ under the linear fractional transformation (11.20) sending $p_1, p_2, p_3$ to ∞, 0, 1.
Therefore, the cross ratio can take any value except for ∞, 0, 1, and one ordered
quadruple of points can be transformed to another by a linear fractional map if and
only if these quadruples have equal cross ratios.


Exercise 11.10 Verify the latter statement. 


Since an invertible linear change of homogeneous coordinates on $\mathbb{P}_1$ is nothing but
a projective automorphism of $\mathbb{P}_1$, the right-hand side of (11.21) does not depend on
the choice of homogeneous coordinates, and the middle part of (11.21), in which
the cross ratio is expressed in terms of the differences of affine coordinates, depends
neither on the choice of affine chart nor on the choice of local affine coordinate
within the chart, assuming that all four points belong to the chart.¹²


11.6.1 Action of the Permutation Group S4 


Let us elucidate the behavior of the cross ratio under permutations of points.
It is clear from (11.21) that a simultaneous transposition of two disjoint pairs
of points does not change the cross ratio: $[p_1, p_2, p_3, p_4] = [p_2, p_1, p_4, p_3] =
[p_3, p_4, p_1, p_2] = [p_4, p_3, p_2, p_1]$. Denote this quantity by $\vartheta$. It is not hard to check


¹² That is, all numbers $p_1, p_2, p_3, p_4$ are finite.
that 24 permutations of points change $\vartheta$ as follows:

$$\begin{aligned}
[p_1, p_2, p_3, p_4] = [p_2, p_1, p_4, p_3] = [p_3, p_4, p_1, p_2] = [p_4, p_3, p_2, p_1] &= \vartheta,\\
[p_2, p_1, p_3, p_4] = [p_1, p_2, p_4, p_3] = [p_3, p_4, p_2, p_1] = [p_4, p_3, p_1, p_2] &= \frac{1}{\vartheta},\\
[p_3, p_2, p_1, p_4] = [p_2, p_3, p_4, p_1] = [p_1, p_4, p_3, p_2] = [p_4, p_1, p_2, p_3] &= \frac{\vartheta}{\vartheta - 1},\\
[p_4, p_2, p_3, p_1] = [p_2, p_4, p_1, p_3] = [p_3, p_1, p_4, p_2] = [p_1, p_3, p_2, p_4] &= 1 - \vartheta,\\
[p_2, p_3, p_1, p_4] = [p_3, p_2, p_4, p_1] = [p_1, p_4, p_2, p_3] = [p_4, p_1, p_3, p_2] &= \frac{\vartheta - 1}{\vartheta},\\
[p_3, p_1, p_2, p_4] = [p_1, p_3, p_4, p_2] = [p_2, p_4, p_3, p_1] = [p_4, p_2, p_1, p_3] &= \frac{1}{1 - \vartheta},
\end{aligned} \tag{11.22}$$


where we put in rows all permutations obtained by transposing disjoint pairs,
respectively, in the identity permutation, in the transpositions $\sigma_{12}$, $\sigma_{13}$, $\sigma_{14}$, and in
the cyclic permutations $\tau = (2, 3, 1, 4)$, $\tau^{-1} = (3, 1, 2, 4)$.


Exercise 11.11 Verify all the formulas (11.22). In particular, check that all 24 
permutations of S4 are listed there. 
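
Before working through Exercise 11.11 by hand, one may run a numerical sanity check. The sketch below (our own helper names; the sample points are arbitrary) groups all 24 orderings of four points by the value of the cross ratio. For a generic quadruple we expect six values, each taken on exactly four orderings; for a harmonic quadruple only three values remain.

```python
from fractions import Fraction
from itertools import permutations


def cross_ratio(p1, p2, p3, p4):
    """Cross ratio of four distinct finite points given by affine coordinates."""
    return Fraction((p1 - p3) * (p2 - p4), (p1 - p4) * (p2 - p3))


def orbit_table(points):
    """Map each value of the cross ratio to the list of orderings producing it."""
    table = {}
    for perm in permutations(range(4)):
        value = cross_ratio(*(points[i] for i in perm))
        table.setdefault(value, []).append(tuple(i + 1 for i in perm))
    return table


points = [0, 1, 3, 7]
theta = cross_ratio(*points)                      # 9/7, a generic value
table = orbit_table(points)
expected = {theta, 1 / theta, theta / (theta - 1),
            1 - theta, (theta - 1) / theta, 1 / (1 - theta)}
assert set(table) == expected
assert all(len(orderings) == 4 for orderings in table.values())
assert (1, 4, 2, 3) in table[(theta - 1) / theta]
assert (1, 4, 3, 2) in table[theta / (theta - 1)]

# A harmonic quadruple: [0, 3, 1, -3] = -1, and only three values occur.
assert set(orbit_table([0, 3, 1, -3])) == {-1, 2, Fraction(1, 2)}
```

Such a check does not replace the verification asked for in the exercise, since it only tests particular points, but it catches misplaced indices quickly.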


11.6.2 Special Quadruples of Points 


It follows from formulas (11.22) that there are three special values $\vartheta = -1, 2, 1/2$
unchanged by the transpositions $\sigma_{12}$, $\sigma_{13}$, $\sigma_{14}$ respectively and determined by the relations

$$\vartheta = \frac{1}{\vartheta}, \qquad \vartheta = \frac{\vartheta}{\vartheta - 1}, \qquad \vartheta = 1 - \vartheta.$$

The cycles (2, 3, 1, 4), (3, 1, 2, 4) permute these three special values cyclically. Also,
there are two special values of $\vartheta$ preserved by the cycles (2, 3, 1, 4), (3, 1, 2, 4).
They satisfy the quadratic equation

$$\vartheta^2 - \vartheta + 1 = 0 \iff \vartheta = \frac{\vartheta - 1}{\vartheta} \iff \vartheta = \frac{1}{1 - \vartheta}$$

and are equal to the cube roots of −1 different from −1, if they exist in k. The
transpositions $\sigma_{12}$, $\sigma_{13}$, $\sigma_{14}$ swap these special values of $\vartheta$.
We say that an ordered quadruple of collinear points is special if its cross ratio
equals one of the five special values of $\vartheta$ listed above. When we permute the
points of a special quadruple, their cross ratio varies through either three or two
different values. If a quadruple is not special, then its cross ratio takes six distinct
values (11.22) under permutations of points.


11.6.3 Harmonic Pairs of Points 


Four collinear points a, b, c, d are called harmonic if

$$[a, b, c, d] = -1. \tag{11.23}$$

In this case, we also say that unordered pairs {a, b} and {c, d} are harmonic to
each other. Such terminology is justified by Sect. 11.6.2, where we have seen that
harmonicity is equivalent to invariance of the cross ratio either under swapping the
points in any one of these pairs or under transposing the pairs with each other. Thus,
harmonicity is a symmetric binary relation on the set of unordered pairs of points
in $\mathbb{P}_1$. Geometrically, pairs {a, b} and {c, d} are harmonic if and only if in the affine
chart with infinity at a, the midpoint of [c, d] goes to b, meaning that $c - b = b - d$.


Example 11.9 (Complete Quadrangle) Associated with a quadruple of points a, b,
c, d ∈ $\mathbb{P}_2$ such that no three of them are collinear is a geometric configuration
formed by three pairs of lines spanned by disjoint pairs of points (see Fig. 11.7).


Fig. 11.7 Complete quadrangle


The lines in each pair are called opposite sides of the quadrangle abcd. The points
a, b, c, d are called the vertices of the quadrangle. The opposite sides intersect at a
triple of points

$$\begin{aligned}
x &= (ab) \cap (cd),\\
y &= (ac) \cap (bd),\\
z &= (ad) \cap (bc).
\end{aligned} \tag{11.24}$$

Three lines joining these points form an associated triangle of the quadrangle abcd.
We claim that in the pencils of lines¹³ centered at x, y, z, two sides of the quadrangle
are harmonic to two sides of the associated triangle. For the proof, let us identify
the pencil of lines passing through x with lines (ad) and (bc) by taking the line
$\ell \ni x$ to $\ell \cap (ad)$ and $\ell \cap (bc)$ respectively. Write $x' \in (ad)$ and $x'' \in (bc)$ for
the points corresponding to the line (xy) under this identification. We have to check that
$[a, d, z, x'] = [b, c, z, x''] = -1$. Since the projections from x and from y establish
linear projective isomorphisms between the lines (ad) and (bc), we conclude that
$[a, d, z, x'] = [b, c, z, x''] = [d, a, z, x']$. Since the cross ratio remains unchanged
under swapping the first two points, it equals −1.
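
The harmonicity claimed in Example 11.9 can be confirmed on a concrete quadrangle. In the sketch below (all names are ours) points and lines of the projective plane are integer triples, the line through two points and the intersection point of two lines are both computed by the vector cross product, and the cross ratio of four collinear points is evaluated through 3 × 3 determinants with an auxiliary point o chosen off the line. This determinant formula is a standard device that we assume here; it is not taken from the text.

```python
from fractions import Fraction


def cross(u, v):
    """Line through two points, or intersection point of two lines, in P2."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def det3(u, v, w):
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]


def cross_ratio_on_line(o, p1, p2, p3, p4):
    """Cross ratio of four collinear points; o is any point off their line."""
    return Fraction(det3(o, p1, p3) * det3(o, p2, p4),
                    det3(o, p1, p4) * det3(o, p2, p3))


a, b, c, d = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)
x = cross(cross(a, b), cross(c, d))
y = cross(cross(a, c), cross(b, d))
z = cross(cross(a, d), cross(b, c))
x1 = cross(cross(x, y), cross(a, d))   # the point x' on the line (ad)
x2 = cross(cross(x, y), cross(b, c))   # the point x'' on the line (bc)

assert cross_ratio_on_line(b, a, d, z, x1) == -1   # b does not lie on (ad)
assert cross_ratio_on_line(a, b, c, z, x2) == -1   # a does not lie on (bc)
```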


Problems for Independent Solution to Chap. 11 


Problem 11.1 Over the finite field of q elements, find the total number of
k-dimensional (a) projective subspaces in $\mathbb{P}_n$, (b) affine subspaces in $\mathbb{A}^n$.

Problem 11.2 Formulate a geometric condition on a triple of lines $\ell_0, \ell_1, \ell_2$ in
$\mathbb{P}_2 = \mathbb{P}(V)$ necessary and sufficient for the existence of a basis $x_0, x_1, x_2$ in
$V^*$ such that each $\ell_i$ becomes the line at infinity for the standard affine chart
$U_i = \{(x_0 : x_1 : x_2) \mid x_i \ne 0\}$ associated with this basis.

Problem 11.3 Choose points $A, B, C \in \mathbb{P}_2$ such that the points

$$A' = (1 : 0 : 0), \quad B' = (0 : 1 : 0), \quad C' = (0 : 0 : 1)$$

lie, respectively, on the lines (BC), (CA), (AB) and the three lines (AA′), (BB′),
(CC′) all pass through (1 : 1 : 1).

Problem 11.4 Let the subset $\Phi \subset \mathbb{P}_n = \mathbb{P}(V)$ be visible as a k-dimensional affine
subspace in every affine chart where it is visible at all (we assume that k < n does
not depend on the chart). Is it true that $\Phi = \mathbb{P}(W)$ for some (k + 1)-dimensional
vector subspace $W \subset V$?


¹³ See Example 11.5 on p. 266.
Problem 11.5 Consider the following plane curves given by affine equations in the
standard chart $U_0 \subset \mathbb{P}_2$: (a) $y = x^2$, (b) $y = x^3$, (c) $y^2 + (x - 1)^2 = 1$,
(d) $y^2 = x(x + 1)$. Write their affine equations in two other standard charts $U_1$, $U_2$ and
draw all 12 affine curves you deal with.


Problem 11.6 Consider the Euclidean plane $\mathbb{R}^2$ as a set of real points within the
standard affine chart $U_0 \simeq \mathbb{C}^2$ in the complex projective plane $\mathbb{CP}_2 = \mathbb{P}(\mathbb{C}^3)$.


(a) Find two points $I_+, I_- \in \mathbb{CP}_2$ lying on all degree-2 curves visible within $\mathbb{R}^2$ as
Euclidean circles.

(b) Let a curve C of degree 2 on $\mathbb{CP}_2$ pass through both points $I_\pm$ and have at least
three noncollinear points within $\mathbb{R}^2$. Prove that $C \cap \mathbb{R}^2$ is a Euclidean circle.


Problem 11.7 In the notation of Example 11.7, show that $(q_0, q_1, q_2)$ given by
formula (11.19) on p. 270, as $(t_1, t_2)$ runs through $\mathbb{Z} \times \mathbb{Z}$, enumerates all the
proportionality classes of integer solutions of the Pythagorean equation
$q_0^2 + q_1^2 = q_2^2$.

Problem 11.8 A nonidentical projective automorphism $\varphi$ is called an involution if
$\varphi^2 = \mathrm{Id}$. Show that each involution of the projective line over an algebraically
closed field has exactly two distinct fixed points.


Problem 11.9 (Projective Duality) Projective spaces $\mathbb{P}_n = \mathbb{P}(V)$ and $\mathbb{P}_n^\times = \mathbb{P}(V^*)$
are called dual. Show that each of them is the space of hyperplanes in the
other. Prove that for each k in the range $0 \le k \le n - 1$, the assignment
$\mathbb{P}(W) \leftrightarrow \mathbb{P}(\operatorname{Ann} W)$ establishes a bijection between k-dimensional projective
subspaces in $\mathbb{P}_n$ and (n − k − 1)-dimensional projective subspaces in $\mathbb{P}_n^\times$. Verify
that this bijection reverses the inclusions of subspaces and takes a subspace
$H \subset \mathbb{P}_n$ to the locus of all hyperplanes passing through H.


Problem 11.10 (Pappus's Theorem) Given two lines $\ell_1 \ne \ell_2$ on $\mathbb{P}_2$ and two triples
of different points $a_1, b_1, c_1 \in \ell_1 \setminus \ell_2$ and $a_2, b_2, c_2 \in \ell_2 \setminus \ell_1$, show that the three
points $(a_1 b_2) \cap (a_2 b_1)$, $(b_1 c_2) \cap (b_2 c_1)$, $(c_1 a_2) \cap (c_2 a_1)$ are collinear.

Problem 11.11 Formulate and prove a projectively dual version of Pappus's
theorem.¹⁴


Problem 11.12 (Desargues's Theorem I) Given two triangles $A_1 B_1 C_1$ and $A_2 B_2 C_2$
on $\mathbb{P}_2$, show that three points $(A_1 B_1) \cap (A_2 B_2)$, $(B_1 C_1) \cap (B_2 C_2)$, $(C_1 A_1) \cap (C_2 A_2)$
are collinear if and only if the three lines $(A_1 A_2)$, $(B_1 B_2)$, $(C_1 C_2)$ are concurrent.¹⁵
(Triangles possessing these properties are called perspective.)


Problem 11.13 (Desargues's Theorem II) Given three distinct points p, q, r on
a line $\ell$ on $\mathbb{P}_2$ and three distinct points a, b, c outside the line, prove that the
lines (ap), (bq), (cr) are concurrent if and only if there exists a linear projective


¹⁴ That is, a statement about the annihilators of points and lines from Pappus's theorem that holds
in $\mathbb{P}_2^\times$. It could start thus: "Given two distinct points and two triples of concurrent lines intersecting
in these points. ..."

¹⁵ That is, lie in the same pencil.
involution of the line $\ell$ exchanging p, q, r with the points $\ell \cap (bc)$, $\ell \cap (ca)$,
$\ell \cap (ab)$, preserving the order.


Problem 11.14 Describe all linear fractional automorphisms $t \mapsto (at + b)/(ct + d)$
(a) preserving ∞, (b) preserving both 0 and ∞, (c) preserving 1 and swapping 0
and ∞, (d) preserving 0 and swapping 1 and ∞, (e) preserving ∞ and swapping
0 and 1.


Problem 11.15 Use the previous problem to obtain without any computations the
equalities

$$\begin{aligned}
[p_2, p_1, p_3, p_4] &= [p_1, p_2, p_3, p_4]^{-1}, \qquad [p_1, p_3, p_2, p_4] = 1 - [p_1, p_2, p_3, p_4],\\
[p_1, p_4, p_2, p_3] &= ([p_1, p_2, p_3, p_4] - 1)/[p_1, p_2, p_3, p_4].
\end{aligned}$$

Problem 11.16 Prove that for five distinct points $p_1, p_2, \ldots, p_5 \in \mathbb{P}_1$, one always
has the equality

$$[p_1, p_2, p_3, p_4] \cdot [p_1, p_2, p_4, p_5] \cdot [p_1, p_2, p_5, p_3] = 1.$$

Problem 11.17 Prove that for eight distinct points $p_1, p_2, p_3, p_4, q_1, q_2, q_3, q_4 \in \mathbb{P}_1$,
one always has the equality

$$[p_1, p_2, q_3, q_4] \cdot [p_2, p_3, q_1, q_4] \cdot [p_3, p_1, q_2, q_4] \cdot [q_1, q_2, p_3, p_4] \cdot [q_2, q_3, p_1, p_4] \cdot [q_3, q_1, p_2, p_4] = 1.$$


Problem 11.18 Write U for the space of homogeneous linear forms in $t_0$, $t_1$, and
$\mathbb{P}_3 = \mathbb{P}(S^3 U)$ for the projectivization of the space of homogeneous cubic forms
in $t_0$, $t_1$. Describe the projections of the Veronese cubic¹⁶ $C_3 \subset \mathbb{P}_3$:


(a) from the point $t_0^3$ in the plane spanned by $3 t_0^2 t_1$, $3 t_0 t_1^2$, $t_1^3$;
(b) from the point $3 t_0^2 t_1$ in the plane spanned by $t_0^3$, $3 t_0 t_1^2$, $t_1^3$;
(c) from the point $t_0^3 + t_1^3$ in the plane spanned by $t_0^3$, $3 t_0^2 t_1$, $3 t_0 t_1^2$.


More precisely, write down an explicit homogeneous equation for each target
curve and draw it in all three standard affine charts in the target plane.


Problem 11.19 (Rational Normal Curves) In the notation of Example 11.6 on
p. 266, prove that the following curves C in $\mathbb{P}_d = \mathbb{P}(S^d U)$ are transformed to
each other by appropriate linear projective automorphisms of $\mathbb{P}_d$:


(a) C is the image of the Veronese map $\mathrm{ver}_d : \mathbb{P}(U) \to \mathbb{P}(S^d U)$, $\varphi \mapsto \varphi^d$;
(b) C is the image of an arbitrary map $F : \mathbb{P}(U) \to \mathbb{P}(S^d U)$ given in homogeneous
coordinates by $(\alpha_0 : \alpha_1) \mapsto (f_0(\alpha) : f_1(\alpha) : \cdots : f_d(\alpha))$, where $f_i(\alpha)$ are any


¹⁶ See Example 11.6 on p. 266.
linearly independent homogeneous degree-d polynomials in $\alpha = (\alpha_0, \alpha_1)$;
(c) C is the image of the map $\mathbb{P}_1 \to \mathbb{P}_d$ given in homogeneous coordinates by

$$(\alpha_0 : \alpha_1) \mapsto \left( \frac{1}{\det(p_0, \alpha)} : \frac{1}{\det(p_1, \alpha)} : \cdots : \frac{1}{\det(p_d, \alpha)} \right),$$

where $p_0, p_1, \ldots, p_d \in \mathbb{P}_1$ are arbitrarily fixed distinct points and
$\det(a, b) = a_0 b_1 - a_1 b_0$ for $a = (a_0 : a_1)$, $b = (b_0 : b_1) \in \mathbb{P}_1$.

(d) Fix d + 3 points $p_1, p_2, \ldots, p_d, a, b, c \in \mathbb{P}_d$ such that no (d + 1) of them lie
within a hyperplane. Write $\ell_i \simeq \mathbb{P}_1 \subset \mathbb{P}_d^\times$ for the pencil of hyperplanes passing
through the (d − 2)-dimensional subspace spanned by the d − 1 points $p_\nu$ with
$\nu \ne i$. Let $\psi_{ij} : \ell_j \xrightarrow{\sim} \ell_i$ be the linear projective isomorphism that sends the three
hyperplanes of $\ell_j$ passing through a, b, c to the three hyperplanes of $\ell_i$ passing
through a, b, c respectively. The curve C is the locus of the intersection points
of the d-tuples of the corresponding hyperplanes:

$$C = \bigcup_{H \in \ell_1} H \cap \psi_{21}(H) \cap \cdots \cap \psi_{d1}(H).$$


Problem 11.20 Show that for any n + 3 points in $\mathbb{P}_n$ such that no (n + 1) of them
lie within a hyperplane, there exists a unique rational normal curve $C \subset \mathbb{P}_n$ from
the previous problem that passes through all n + 3 given points.


Chapter 12 
Groups 


12.1 Definition and First Examples 


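A set G is called a group if it is equipped with a binary operation

$$G \times G \to G, \quad (g_1, g_2) \mapsto g_1 g_2,$$

called composition,¹ that satisfies the following three properties:

$$\begin{aligned}
&\text{associativity:} && \forall\, f, g, h \in G \quad (fg)h = f(gh), && (12.1)\\
&\text{existence of unit:} && \exists\, e \in G : \ \forall\, g \in G, \ eg = g, && (12.2)\\
&\text{existence of inverse elements:} && \forall\, g \in G \ \ \exists\, g^{-1} \in G : \ g^{-1} g = e. && (12.3)
\end{aligned}$$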


For example, every multiplicative abelian group is a group satisfying one extra 
relation, commutativity: fg = gf for all f, g € G. 

A map of groups $\varphi : S \to T$ is called a group homomorphism if
$\varphi(gh) = \varphi(g)\varphi(h)$ for all $g, h \in S$. In the context of groups, the terms monomorphism,
epimorphism, and isomorphism mean a group homomorphism that respectively is
injective, surjective, and bijective.


Exercise 12.1 Check that the composition of homomorphisms is a homomorphism. 


The conditions (12.1)–(12.3) immediately imply some other expected properties
of composition. The element $g^{-1}$ in (12.3) that is a left inverse of g is automatically
a right inverse. This follows from the equalities $g^{-1} g g^{-1} = e g^{-1} = g^{-1}$. Left
multiplication of the left and right sides by the element that is a left inverse to $g^{-1}$
leads to $e g g^{-1} = e$, that is, $g g^{-1} = e$. The element $g^{-1}$ that is the inverse of g is uniquely determined


¹ Sometimes also, depending on context, called multiplication, and sometimes, in the case of an
abelian group, addition and denoted by +.
by g, because the equalities fg = e and gh = e imply $f = fe = f(gh) = (fg)h = eh = h$.


Exercise 12.2 Check that $(g_1 g_2 \cdots g_k)^{-1} = g_k^{-1} \cdots g_2^{-1} g_1^{-1}$.


The left unit e in (12.2) is also the right unit, because $ge = g(g^{-1} g) = (g g^{-1}) g = eg = g$.
This implies the uniqueness of the unit, because for two units e′, e″, the
equalities $e' = e' e'' = e''$ hold.

For a finite group G, we write |G| for the cardinality of G and call it the order
of G. A subset $H \subset G$ is called a subgroup of G if H is a group with respect to the
composition in G. The intersection of any set of subgroups is clearly a subgroup.


Exercise 12.3 Verify that a nonempty subset $H \subset G$ is a subgroup if and only if
$h_1 h_2^{-1} \in H$ for all $h_1, h_2 \in H$.


Example 12.1 (Transformation Groups) The main motivating examples of groups
are the transformation groups. We have already met many such groups. The bijective
endomorphisms $X \xrightarrow{\sim} X$ of a set X form a group, denoted by Aut X and called the
automorphism group of X. The automorphism group of a finite set {1, 2, ..., n}
is the symmetric group $S_n$. Its order is given by $|S_n| = n!$. The even permutations
form a subgroup $A_n \subset S_n$ of order $|A_n| = n!/2$. A transformation group of X
is any subgroup $G \subset \operatorname{Aut} X$. If a set X is equipped with some extra algebraic or
geometric structure, then the set of all automorphisms of X respecting that structure
forms a transformation group $G \subset \operatorname{Aut} X$. In this way, we get linear groups acting
on a vector space V: the general linear group GL(V) of all linear automorphisms;
the special linear group SL(V) of volume-preserving linear automorphisms; the
orthogonal group O(V) of isometries of a Euclidean vector space V and its special
subgroup $SO(V) = O(V) \cap SL(V)$; the projective linear group PGL(V) of linear
projective automorphisms of projective space $\mathbb{P}(V)$; etc. If X = G is itself a group,
we write Aut G for the group of group automorphisms of G.
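
For small n, the statements about $S_n$ and $A_n$ can be checked by brute force. In the sketch below (names are ours) a permutation g is stored as the tuple (g(1), g(2), ..., g(n)), parity is computed by counting inversions, and the subgroup test is the criterion of Exercise 12.3.

```python
from itertools import permutations
from math import factorial


def compose(f, g):
    """The composition fg, acting as x -> f(g(x))."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def inverse(f):
    inv = [0] * len(f)
    for i, fi in enumerate(f, start=1):
        inv[fi - 1] = i
    return tuple(inv)


def is_even(f):
    n = len(f)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if f[i] > f[j])
    return inversions % 2 == 0


def is_subgroup(h):
    """Criterion of Exercise 12.3 for a finite set h of permutations."""
    return bool(h) and all(compose(a, inverse(b)) in h for a in h for b in h)


n = 4
s_n = set(permutations(range(1, n + 1)))
a_n = {f for f in s_n if is_even(f)}

assert len(s_n) == factorial(n) and len(a_n) == factorial(n) // 2
assert is_subgroup(s_n) and is_subgroup(a_n)
assert not is_subgroup(s_n - a_n)   # the odd permutations do not contain the unit
```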


Exercise 12.4 Let G be the additive residue group $\mathbb{Z}/(p)$ for a prime $p \in \mathbb{N}$.
Describe Aut G.


For a transformation g of a set X and an element x € X, we will shorten the notation 
g(x) to gx. 


12.2 Cycles 


12.2.1 Cyclic Subgroups 


For a group G and element $g \in G$, we write $\langle g \rangle$ for the smallest subgroup in G
containing g. It is called the cyclic subgroup spanned by g, because it is formed by
all integer powers $g^m$, where we put $g^0 \stackrel{\text{def}}{=} e$ and $g^{-n} \stackrel{\text{def}}{=} (g^{-1})^n$ as usual. Therefore,
the group $\langle g \rangle$ is abelian, and there is a surjective homomorphism of abelian groups

$$\varphi_g : \mathbb{Z} \twoheadrightarrow \langle g \rangle, \quad m \mapsto g^m,$$

which takes addition to multiplication. If $\ker \varphi_g = 0$, then $\langle g \rangle \simeq \mathbb{Z}$, and all integer
powers $g^m$ are distinct. In this case, we say that g has infinite order and write
$\operatorname{ord} g = \infty$. If $\ker \varphi_g \ne 0$, then $\ker \varphi_g = (n)$ and $\langle g \rangle \simeq \mathbb{Z}/(n)$, where $n \in \mathbb{N}$ is the smallest
positive exponent such that $g^n = e$. This exponent is called the order of the element
g and is denoted by ord(g). Note that the order of an element equals the order of the
cyclic subgroup spanned by that element. If ord g = n, then all distinct elements of
$\langle g \rangle$ are exhausted by $e = g^0,\ g = g^1,\ g^2,\ \ldots,\ g^{n-1}$.


12.2.2 Cyclic Groups 


An abstract group G is called cyclic² if $G = \langle g \rangle$ for some $g \in G$. Such an element
g is called a generator of the cyclic group G. For example, the additive group of
integers $\mathbb{Z}$ is cyclic and has two generators, 1 and −1. We have seen in Theorem 3.2
on p. 63 that every finite multiplicative subgroup of a field is cyclic. The additive
group of residue classes $\mathbb{Z}/(n)$ is also cyclic and usually has many generators. For
example, $\mathbb{Z}/(10)$ is generated by each of the four classes $[\pm 1]_{10}$, $[\pm 3]_{10}$ and by none
of the six remaining classes.


Lemma 12.1 An element $h = g^k$ generates the cyclic group $\langle g \rangle$ of order n if and
only if GCD(k, n) = 1.


Proof Since $\langle h \rangle \subset \langle g \rangle$, the coincidence $\langle h \rangle = \langle g \rangle$ is equivalent to the inequality
$\operatorname{ord} h \ge n$. The equality $h^m = g^{mk} = e$ holds if and only if $n \mid mk$. If GCD(n, k) = 1,
then $n \mid mk$ only if $n \mid m$, which forces ord h = n. If $n = n_1 d$ and $k = k_1 d$ for some
integer d > 1, then $h^{n_1} = g^{k n_1} = g^{n k_1} = e$, that is, $\operatorname{ord} h \le n_1 < n$. □
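
In the additive group $\mathbb{Z}/(n)$, the lemma says that the class of k is a generator exactly when k is coprime to n. A short brute-force check (a sketch with our own function names) confirms this for small n and recovers the four generators of $\mathbb{Z}/(10)$ mentioned above.

```python
from math import gcd


def generated_subgroup(k, n):
    """Elements of the cyclic subgroup of Z/(n) spanned by the class of k."""
    elements, x = set(), 0
    while True:
        elements.add(x)
        x = (x + k) % n
        if x == 0:              # we have returned to the zero class
            return elements


def generators(n):
    return [k for k in range(n) if len(generated_subgroup(k, n)) == n]


print(generators(10))   # [1, 3, 7, 9], i.e. the classes of ±1 and ±3

for n in range(1, 40):
    for k in range(n):
        # The class of k has order n / GCD(k, n); it generates Z/(n) iff GCD(k, n) = 1.
        assert len(generated_subgroup(k, n)) == n // gcd(k, n)
```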


12.2.3 Cyclic Type of Permutation 


A permutation $\tau \in S_n$ is called a cycle³ of length m if it acts on some m elements⁴
$i_1, i_2, \ldots, i_m \in \{1, 2, \ldots, n\}$ as

$$i_1 \mapsto i_2 \mapsto \cdots \mapsto i_{m-1} \mapsto i_m \mapsto i_1 \tag{12.4}$$


² See Sect. 3.6.1 on p. 62.
³ Or a cyclic permutation.
⁴ Not necessarily sequential or increasing.
and leaves all the other elements fixed. In this case, we write $\tau = |i_1, i_2, \ldots, i_m\rangle$.
The notation is indicative but not quite correct: the same cycle (12.4) permits m
distinct expressions $|i_1, i_2, \ldots, i_m\rangle$ obtained from each other by cyclic permutations
of the indices.


Exercise 12.5 How many different cycles of length m are there in $S_n$?
Exercise 12.6 Check that $|i_1, i_2, \ldots, i_m\rangle^k$ is a cycle if and only if GCD(k, m) = 1.


Theorem 12.1 Every permutation is a composition of disjoint (and therefore 
commuting) cycles. Such a decomposition is unique up to permutation of the factors. 


Proof Since the set X = {1, 2, ..., n} is finite, for every $x \in X$ and $g \in S_n$, there
must be repetitions in the infinite sequence

$$x \overset{g}{\longmapsto} gx \overset{g}{\longmapsto} g^2 x \overset{g}{\longmapsto} g^3 x \overset{g}{\longmapsto} \cdots.$$

Since g is bijective, the leftmost repeated element is x. Therefore, under iterations
of g, each point $x \in X$ goes through some cycle. Two such cycles beginning from
different points x, y either coincide or do not intersect each other at all, because g is
bijective. □
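
The proof translates directly into an algorithm: start from a point that has not been visited yet and follow it under iterations of g until it returns. A minimal sketch (function names are ours; a permutation g is stored as the tuple (g(1), ..., g(n))), applied to the permutation considered in Example 12.2:

```python
def cycles(g):
    """Disjoint cycles of g; fixed points are returned as cycles of length 1."""
    seen, result = set(), []
    for start in range(1, len(g) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:        # follow start -> g(start) -> g(g(start)) -> ...
            seen.add(x)
            cycle.append(x)
            x = g[x - 1]
        result.append(tuple(cycle))
    return result


def cyclic_type(g):
    """Cycle lengths in nonincreasing order."""
    return sorted((len(c) for c in cycles(g)), reverse=True)


g = (6, 5, 4, 1, 8, 3, 9, 2, 7)
print(cycles(g))        # [(1, 6, 3, 4), (2, 5, 8), (7, 9)]
print(cyclic_type(g))   # [4, 3, 2]
```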


Exercise 12.7 Show that two cycles $\tau_1, \tau_2 \in S_n$ commute in exactly two cases: (1)
they are disjoint; (2) they are of the same length m and $\tau_1 = \tau_2^s$ for some s coprime
to m.


Definition 12.1 (Cyclic Type of Permutation) Let a permutation $g \in S_n$ be a
composition of disjoint cycles $\tau_1, \tau_2, \ldots, \tau_k$ numbered in nonincreasing order of
their lengths $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$, including all cycles of length 1 (corresponding
to fixed elements of g). The Young diagram $\lambda(g)$ formed by the rows of lengths
$\lambda_1, \lambda_2, \ldots, \lambda_k$ is called a cyclic type of the permutation g. Note that this Young
diagram consists of $|\lambda| = n$ cells.


Example 12.2 The permutation $g = (6, 5, 4, 1, 8, 3, 9, 2, 7) \in S_9$ can be
decomposed into disjoint cycles as

$$g = |1, 6, 3, 4\rangle\, |2, 5, 8\rangle\, |7, 9\rangle \quad \text{and has} \quad \lambda(g) = (4, 3, 2).$$

If we write the contents of the cycles in the rows of a Young diagram, we see
that there may be different permutations of the same cyclic type. For example, the
permutations

$$g = \begin{array}{cccc} 1 & 6 & 3 & 4 \\ 2 & 5 & 8 & \\ 7 & 9 & & \end{array} \qquad \text{and} \qquad g' = \begin{array}{cccc} \cdots & & & \\ 2 & 5 & 8 & \\ 1 & 6 & & \end{array}$$

certainly are different. There are altogether (n − 1)! different maximal cycles of
length n, which have cyclic type $\lambda = (n)$ (one row of width n). The only permutation
of cyclic type $\lambda = 1^n = (1, 1, \ldots, 1)$ (one column of height n) is the identity.
Exercise 12.8 Write $m_i = m_i(\lambda)$ for the number of rows of length i in the Young
diagram $\lambda$. Therefore, $0 \le m_i \le |\lambda|$ for each i and $\sum_i i \cdot m_i = |\lambda| = n$. Show that
the total number of distinct permutations of cyclic type $\lambda$ in $S_n$ is equal⁵ to

$$\frac{n!}{\prod_i i^{m_i} \cdot m_i!}.$$
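
The counting formula of Exercise 12.8 can be compared with a brute-force enumeration for small n. The following sketch (our own function names) classifies all permutations of $S_5$ by cyclic type and compares the counts with $n!/\prod_i i^{m_i} m_i!$. It is a numerical check, not a proof.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod


def cyclic_type(g):
    """Cycle lengths of g = (g(1), ..., g(n)) in nonincreasing order, as a tuple."""
    seen, lengths = set(), []
    for start in range(1, len(g) + 1):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = g[x - 1]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def count_by_formula(lam):
    """n! / prod_i (i^{m_i} * m_i!), where m_i is the number of rows of length i."""
    m = Counter(lam)
    return factorial(sum(lam)) // prod(i ** mi * factorial(mi) for i, mi in m.items())


n = 5
counts = Counter(cyclic_type(g) for g in permutations(range(1, n + 1)))
assert all(counts[lam] == count_by_formula(lam) for lam in counts)
assert sum(counts.values()) == factorial(n)
assert counts[(n,)] == factorial(n - 1)      # maximal cycles
assert counts[(1,) * n] == 1                 # the identity
```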


Example 12.3 (How to Compute Order and Sign) The order of a permutation of
cyclic type $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_s)$ is equal to $\mathrm{LCM}(\lambda_1, \lambda_2, \ldots, \lambda_s)$. For example, the
permutation

$$(3, 12, 7, 9, 10, 4, 11, 1, 6, 2, 8, 5) = |1, 3, 7, 11, 8\rangle\, |2, 12, 5, 10\rangle\, |4, 9, 6\rangle \in S_{12}$$

has order 5 · 4 · 3 = 60. The thread rule⁶ shows that the sign of a cycle of length m
equals $(-1)^{m-1}$. Hence a permutation is even if and only if it has an even number of
even-length cycles.
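
Both rules are straightforward to program. The sketch below (function names are ours) computes the order as the least common multiple of the cycle lengths and the sign as the product of the signs $(-1)^{m-1}$ of the cycles, and applies them to the permutation from $S_{12}$ considered above.

```python
from functools import reduce
from math import gcd


def cycle_lengths(g):
    """Lengths of the disjoint cycles of g = (g(1), ..., g(n)), fixed points included."""
    seen, lengths = set(), []
    for start in range(1, len(g) + 1):
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = g[x - 1]
            length += 1
        if length:
            lengths.append(length)
    return lengths


def order(g):
    """Least common multiple of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(g), 1)


def sign(g):
    """Product of the signs (-1)^(m-1) over all cycles of g."""
    return (-1) ** sum(m - 1 for m in cycle_lengths(g))


g = (3, 12, 7, 9, 10, 4, 11, 1, 6, 2, 8, 5)
print(sorted(cycle_lengths(g)), order(g), sign(g))   # [3, 4, 5] 60 -1
```

The permutation is odd because it has exactly one cycle of even length.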


Exercise 12.9 Calculate sgn(g) and $g^{15}$ for $g = (6, 5, 4, 1, 8, 3, 9, 2, 7) \in S_9$.


12.3 Groups of Figures 


For each figure $\Phi$ in Euclidean space $\mathbb{R}^n$, the bijective maps $\Phi \to \Phi$ induced by the
orthogonal⁷ linear maps $F : \mathbb{R}^n \to \mathbb{R}^n$ such that $F(\Phi) = \Phi$ form a transformation
group of $\Phi$. This group is called the complete group of the figure $\Phi$ and is denoted
by $O_\Phi$. The subgroup $SO_\Phi \subset O_\Phi$ formed by the bijections induced by the proper⁸
orthogonal maps $\mathbb{R}^n \to \mathbb{R}^n$ is called the proper group of the figure. If $\Phi$ lies within
some hyperplane $\Pi \subset \mathbb{R}^n$, then the complete and proper groups coincide, because
every nonproper isometry F composed with an orthogonal reflection in $\Pi$ becomes
proper but has the same restriction on $\Pi$ as F.


Exercise 12.10 Make models of the five Platonic solids: tetrahedron, octahedron,
cube, dodecahedron, and icosahedron.⁹


⁵ Compare with formula (1.11) on p. 7.

⁶ See Example 9.2 on p. 209.

⁷ See Sect. 10.5 on p. 244.

⁸ That is, orientation-preserving (see Sect. 10.5 on p. 244).
⁹ See Figs. 12.5, 12.6, 12.7, 12.8 on p. 288.
Fig. 12.1 The dihedral group $D_2$

Example 12.4 (Dihedral Groups $D_n$) Consider a regular plane n-gon placed within
$\mathbb{R}^3$ in such a way that its center is at the origin. The group of such an n-gon is
denoted¹⁰ by $D_n$ and called the nth dihedral group. The simplest dihedron has n = 2
and looks like an oblong lune with two vertices, two edges, and two faces as
shown in Fig. 12.1. The group $D_2$ of such a lune is the same as the group of a
circumscribed rectangle or the group of an inscribed rhombus, assuming that both
are not squares. The group $D_2$ consists of the identity map and three rotations by
180° about mutually perpendicular axes: one joins the vertices, another joins the
midpoints of edges, and the third is perpendicular to the plane of the dihedron and
passes through its center. The group $D_2$ is also known as the Klein four group and
is often denoted by $V_4$.


Exercise 12.11 Verify that $D_2 \simeq \mathbb{Z}/(2) \oplus \mathbb{Z}/(2)$.


Fig. 12.2 Group of a triangle

¹⁰ In many textbooks, this group is denoted by $D_{2n}$ to emphasize its order, but we prefer to stress
its geometric origin.
Next, for n = 3, we get the group of the triangle $D_3$ of order 6. It consists of the
identity map, two rotations $\tau$, $\tau^{-1}$ by ±120° about the center, and three reflections¹¹
$\sigma_{ij}$ in the medians of the triangle (see Fig. 12.2). Since an isometry $\mathbb{R}^2 \xrightarrow{\sim} \mathbb{R}^2$ is
uniquely determined by its action on the vertices of an equilateral triangle centered
at the origin, the triangle group $D_3$ is isomorphic to the permutation group $S_3$ of
the vertices. Under this isomorphism, rotations by ±120° are identified with the
mutually inverse cycles (2, 3, 1), (3, 1, 2), while reflections are identified with the
transpositions $\sigma_{23} = (1, 3, 2)$, $\sigma_{13} = (3, 2, 1)$, $\sigma_{12} = (2, 1, 3)$.


Fig. 12.3 Dihedral reflection axes for $D_4$, $D_5$, $D_6$


Since an isometry $\mathbb{R}^2 \to \mathbb{R}^2$ is uniquely determined by its action on the affine
coordinate system formed by some vertex and two basis vectors drawn along the
outgoing edges from this vertex, the dihedral group $D_n$ has order $|D_n| = 2n$ for all
$n \ge 2$: the chosen vertex can be mapped to any one of n vertices, whereupon we have
two ways of mapping the basis vectors. The resulting 2n maps are exhausted by the
n rotations about the center of the dihedron by angles $2\pi k/n$, where $0 \le k \le n - 1$
(for k = 0 we get the identity map), and n reflections¹² in the lines joining a vertex


¹¹ Which also may be treated as rotations by 180° about the medians.
¹² Or rotations by 180° in the space $\mathbb{R}^3$.




with the midpoint of the opposite edge for odd n and joining either opposite vertices 
or midpoints of opposite edges for even n (see Fig. 12.3). 
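
For $n \ge 3$, the dihedral group acts faithfully on the vertices of the n-gon, so the count $|D_n| = 2n$ can be confirmed by generating the group as a set of vertex permutations. The sketch below makes the following assumptions, which are ours: the vertices are numbered 0, 1, ..., n − 1 around the polygon, the rotation by $2\pi/n$ is $i \mapsto i + 1 \pmod n$, and one reflection is $i \mapsto -i \pmod n$.

```python
def dihedral(n):
    """Vertex permutations generated by one rotation and one reflection (n >= 3)."""
    rotation = tuple((i + 1) % n for i in range(n))
    reflection = tuple((-i) % n for i in range(n))
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in (rotation, reflection):
            h = tuple(s[g[i]] for i in range(n))    # h = s composed with g
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


for n in range(3, 9):
    group = dihedral(n)
    rotations = {tuple((i + k) % n for i in range(n)) for k in range(n)}
    assert len(group) == 2 * n
    assert rotations <= group and len(group - rotations) == n   # n rotations, n reflections
```

For n = 2 this model is not faithful, because the two vertices of the lune cannot distinguish all four elements of $D_2$; this is why the loop starts from n = 3.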


Exercise 12.12 Write multiplication tables for the groups $D_3$, $D_4$, $D_5$ similar to the
one presented in formula (1.20) on p. 11.


Example 12.5 (Groups of the Tetrahedron) Consider a regular tetrahedron centered
at the origin of $\mathbb{R}^3$. Since an isometry of $\mathbb{R}^3$ is uniquely determined by its action on
the vertices of the tetrahedron and this action can be chosen arbitrarily, the complete
group of the regular tetrahedron $O_{\text{tet}}$ is isomorphic to the permutation group $S_4$ of
the vertices. In particular, $|O_{\text{tet}}| = |S_4| = 24$. The proper tetrahedral group $SO_{\text{tet}}$
consists of 12 = 4 · 3 transformations. Indeed, a rotation of the tetrahedron is
uniquely determined by its action on the affine coordinate system formed by a vertex
and the outgoing three edges from it. The vertex can be mapped to any one of four
vertices, whereupon we have exactly three possibilities for an orientation-preserving
superposition of edges. The resulting 12 rotations are the identity, 4 · 2 = 8 rotations
by ±120° about the axes joining a vertex with the center of the opposite face, and
three rotations by 180° about the axes joining the midpoints of opposite edges (see
Fig. 12.4).


Fig. 12.4 Reflection plane $\sigma_{12}$ and axis of 180° rotation $\sigma_{12}\sigma_{34}$


The complete tetrahedral group consists of the 12 rotations just listed, 6 reflections
$\sigma_{ij}$ in the planes passing through the midpoint of the edge [i, j] and the opposite
edge, and 6 more improper transformations corresponding to 6 cyclic permutations
of vertices: $|1234\rangle$, $|1243\rangle$, $|1324\rangle$, $|1342\rangle$, $|1423\rangle$, $|1432\rangle$. Any such 4-cycle can
be realized as a rotation by ±90° about the axis joining the midpoints of opposite
edges followed by a reflection in the plane perpendicular to the axis and passing
through the center of the tetrahedron.

Exercise 12.13 Write each 4-cycle as a composition of reflections $\sigma_{ij}$.


Exercise 12.14 Check that the isomorphism $O_{\mathrm{tet}} \simeq S_4$ takes reflections $\sigma_{ij}$ to
transpositions $|ij\rangle$; rotations by $\pm 120°$ (i.e., compositions $\sigma_{ij}\sigma_{jk}$ for different $i$, $j$,
$k$) to cycles $|ijk\rangle$; rotations by 180° to three pairs of simultaneous transpositions
in disjoint pairs $\sigma_{12}\sigma_{34} = (2, 1, 4, 3)$, $\sigma_{13}\sigma_{24} = (3, 4, 1, 2)$, $\sigma_{14}\sigma_{23} = (4, 3, 2, 1)$
of cyclic type $(2,2)$. Convince yourself that the latter triple of involutions together
with the identity forms the Klein four group $V_4 \simeq D_2$.


Example 12.6 (Groups of the Dodecahedron) As in the previous example, each
rotation of a regular dodecahedron is uniquely determined by its action on the affine
coordinate system formed by some vertex and the triple of outgoing edges. We can
move the vertex to any one of 20 dodecahedral vertices, whereupon there are three
ways for an orientation-preserving superposition of edges. Therefore, the proper
group of the dodecahedron consists of $20 \cdot 3 = 60$ rotations: $6 \cdot 4 = 24$ rotations by
angles $2\pi k/5$, $1 \le k \le 4$, about axes joining the centers of opposite faces, $10 \cdot 2 = 20$
rotations by angles $\pm 2\pi/3$ about axes joining opposite vertices, 15 rotations by
angles 180° about axes joining the midpoints of opposite edges (see Fig. 12.5), and
the identity. The complete group of the dodecahedron has cardinality $20 \cdot 6 = 120$.
Besides the 60 rotations just listed, it contains their compositions with the central
symmetry in the origin $-\mathrm{Id} : \mathbb{R}^3 \to \mathbb{R}^3$, $v \mapsto -v$.


Exercise 12.15 Show that the orders of the complete groups of the cube, octa- 
hedron, and icosahedron (Figs. 12.6, 12.7, and 12.8) are equal to 48, 48, and 120 
respectively. Check that the corresponding proper groups consist of 24, 24, and 60 
rotations and list them all. 


Fig. 12.5 Dodecahedron 
Fig. 12.6 Icosahedron


Fig. 12.7 Cube 


Fig. 12.8 Octahedron 

12.4 Homomorphisms of Groups


Homomorphisms of arbitrary groups share the key properties of the abelian group
homomorphisms considered in Sect. 2.6 on p. 31. Namely, a homomorphism

$$\varphi : G_1 \to G_2$$

sends the unit $e_1 \in G_1$ to the unit $e_2 \in G_2$. This follows from the equalities

$$\varphi(e_1)\varphi(e_1) = \varphi(e_1 e_1) = \varphi(e_1)$$

after multiplication of the left and right sides by $\varphi(e_1)^{-1}$. Further, $\varphi(g^{-1}) = \varphi(g)^{-1}$
for every $g \in G_1$, because $\varphi(g^{-1})\varphi(g) = \varphi(g^{-1}g) = \varphi(e_1) = e_2$. Therefore, the
image $\operatorname{im}\varphi \stackrel{\mathrm{def}}{=} \varphi(G_1) \subset G_2$ of a group homomorphism is a subgroup in the target
group. As for abelian groups, the preimage of the unit is called the kernel of the
homomorphism and is denoted by

$$\ker\varphi \stackrel{\mathrm{def}}{=} \varphi^{-1}(e_2) = \{g \in G_1 \mid \varphi(g) = e_2\}.$$

The kernel is a subgroup of $G_1$ by Exercise 12.3, because for all $g, h \in \ker\varphi$, we
have

$$\varphi(gh^{-1}) = \varphi(g)\varphi(h)^{-1} = e_2 e_2 = e_2.$$


The set-theoretic structure of the other fibers is also the same as in the abelian case. 


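Proposition 12.1 All nonempty fibers of a group homomorphism $\varphi : G_1 \to G_2$ are
in bijection with $\ker\varphi$. Namely, $\varphi^{-1}(\varphi(g)) = g \cdot (\ker\varphi) = (\ker\varphi) \cdot g$ for every
$g \in G_1$, where $g \cdot (\ker\varphi) = \{gh \mid h \in \ker\varphi\}$ and $(\ker\varphi) \cdot g = \{hg \mid h \in \ker\varphi\}$.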


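Proof If $\varphi(t) = \varphi(g)$, then $\varphi(tg^{-1}) = \varphi(t)\varphi(g)^{-1} = e$ and $\varphi(g^{-1}t) =
\varphi(g)^{-1}\varphi(t) = e$. Hence, $tg^{-1} \in \ker\varphi$ and $g^{-1}t \in \ker\varphi$. Therefore, $t$ lies in
both sets $(\ker\varphi) \cdot g$ and $g \cdot (\ker\varphi)$. Conversely, for all $h \in \ker\varphi$, we have
$\varphi(hg) = \varphi(h)\varphi(g) = \varphi(g)$ and $\varphi(gh) = \varphi(g)\varphi(h) = \varphi(g)$. Hence, for every
$g \in G_1$, the fiber $\varphi^{-1}(\varphi(g))$ coincides with both sets $(\ker\varphi) \cdot g$ and $g \cdot (\ker\varphi)$. This
forces $(\ker\varphi) \cdot g = g \cdot (\ker\varphi)$. Inverse bijections between $\varphi^{-1}(\varphi(g)) = g \cdot (\ker\varphi)$
and $\ker\varphi$ are provided by the maps

$$\ker\varphi \to g \cdot (\ker\varphi),\ h \mapsto gh, \qquad g \cdot (\ker\varphi) \to \ker\varphi,\ t \mapsto g^{-1}t.$$
□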


Corollary 12.1 A group homomorphism $\varphi : G_1 \to G_2$ is injective if and only if
$\ker\varphi = \{e\}$. □

Corollary 12.2 For every finite group $G$ and group homomorphism $\varphi : G \to H$,
the equality

$$|\operatorname{im}(\varphi)| = |G| / |\ker(\varphi)|$$

holds. In particular, both $|\ker\varphi|$ and $|\operatorname{im}\varphi|$ divide $|G|$. □


Example 12.7 (Sign Homomorphism) In Corollary 9.1 on p. 209, we constructed
the sign homomorphism $\operatorname{sgn} : S_n \to \{\pm 1\}$ from the symmetric group to the
multiplicative group of signs.¹³ Its kernel is the group of even permutations $A_n =
\ker \operatorname{sgn}$ of order $|A_n| = n!/2$.
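
The statement of Corollary 12.2 can be checked by brute force on the sign homomorphism. The sketch below is only an illustration under our own conventions: permutations of {0, ..., n-1} are stored as tuples, and the helper names `sign` and `compose` are hypothetical, not notation from the text.

```python
from itertools import permutations
from math import factorial

def sign(p):
    """Sign of a permutation p of 0..n-1, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """Composition p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

n = 4
group = list(permutations(range(n)))

# sgn is a homomorphism: sgn(pq) = sgn(p) sgn(q) for all pairs.
is_hom = all(sign(compose(p, q)) == sign(p) * sign(q) for p in group for q in group)

kernel = [p for p in group if sign(p) == 1]   # the alternating group A_n
image = {sign(p) for p in group}              # {+1, -1}

print(is_hom, len(kernel), len(image), len(group) // len(kernel))
```

For n = 4 the kernel has 12 = 4!/2 elements and the image has 2 = 24/12 elements, in agreement with the corollary.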


Example 12.8 (Determinant and Finite Linear Groups) In Sect. 9.3.2 on p. 214 we
constructed the determinant homomorphism

$$\det : GL(V) \to \Bbbk^*, \quad F \mapsto \det F, \tag{12.5}$$

from the general linear group $GL(V)$ of a vector space $V$ to the multiplicative group
$\Bbbk^*$ of the ground field $\Bbbk$. The kernel of the determinant homomorphism is the special
linear group

$$SL(V) = \ker\det = \{F \in GL(V) \mid \det F = 1\}.$$

If $\dim V = n$ and $\Bbbk = \mathbb{F}_q$ consists of $q$ elements, the group $GL(V)$ is finite of order

$$|GL_n(\mathbb{F}_q)| = (q^n - 1)(q^n - q)(q^n - q^2) \cdots (q^n - q^{n-1}),$$

because the elements of $GL(V)$ are in bijection with the bases of $V$.

Exercise 12.16 Check this.

Since the determinant homomorphism (12.5) is surjective,¹⁴ the special linear group
has order $|SL_n(\mathbb{F}_q)| = |GL_n(\mathbb{F}_q)| / |\Bbbk^*| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-1}) / (q - 1)$.
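
As a sanity check of the two order formulas, one can count invertible 2 x 2 matrices over a small prime field directly. This is a minimal sketch that assumes q = p is prime, so that arithmetic in the field is just arithmetic modulo p; the function names are ours.

```python
from itertools import product

def gl_order_formula(n, q):
    """(q^n - 1)(q^n - q)...(q^n - q^(n-1)), the order of GL_n over a field of q elements."""
    order = 1
    for i in range(n):
        order *= q**n - q**i
    return order

def count_2x2(p):
    """Brute-force count of 2x2 matrices over Z/(p), p prime:
    returns (number with nonzero determinant, number with determinant 1)."""
    gl = sl = 0
    for a, b, c, d in product(range(p), repeat=4):
        det = (a * d - b * c) % p
        if det != 0:
            gl += 1
        if det == 1:
            sl += 1
    return gl, sl

for p in (2, 3, 5):
    gl, sl = count_2x2(p)
    print(p, gl, sl, gl_order_formula(2, p), gl_order_formula(2, p) // (p - 1))
```

For p = 3 both the count and the formula give 48 invertible matrices, of which 48/2 = 24 have determinant 1.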


Example 12.9 (Homomorphism of Linear Group to Projective Group) In Sect. 11.5
on p. 270, we have seen that every linear automorphism $F \in GL(V)$ induces a
bijective transformation $\overline{F} : \mathbb{P}(V) \to \mathbb{P}(V)$. This gives a surjective homomorphism

$$\pi : GL(V) \twoheadrightarrow PGL(V), \quad F \mapsto \overline{F}. \tag{12.6}$$

By Theorem 11.1 on p. 270, its kernel $\ker\pi \simeq \Bbbk^*$ consists of the scalar homotheties
$v \mapsto \lambda v$, $\lambda \in \Bbbk^*$. The proportionality classes of volume-preserving operators form
a subgroup called the special projective group and denoted by $PSL(V) \subset PGL(V)$.


¹³ It is isomorphic to the additive group $\mathbb{Z}/(2)$ by taking $1 \mapsto 0$, $-1 \mapsto 1$.
¹⁴ The diagonal matrix $F$ with diagonal elements $(\lambda, 1, 1, \ldots, 1)$ has $\det F = \lambda$.




Restricting the surjection (12.6) to the subgroup $SL(V) \subset GL(V)$, we get a
surjective homomorphism

$$\pi' : SL(V) \twoheadrightarrow PSL(V), \quad F \mapsto \overline{F}, \tag{12.7}$$

with finite kernel isomorphic to the multiplicative group $\mu_n(\Bbbk) = \{\zeta \in \Bbbk^* \mid \zeta^n = 1\}$
of $n$th roots of unity in $\Bbbk$.


Example 12.10 (Surjection $S_4 \twoheadrightarrow S_3$) In Example 11.9 on p. 274 we attached a
complete quadrangle $abcd$ to a quadruple of points $a, b, c, d \in \mathbb{P}_2$ such that no three
of them are collinear. It is formed by three pairs of opposite edges $(ab)$ and $(cd)$,
$(ac)$ and $(bd)$, $(ad)$ and $(bc)$ (see Fig. 12.9) crossing in a triple of points

$$x = (ab) \cap (cd), \quad y = (ac) \cap (bd), \quad z = (ad) \cap (bc), \tag{12.8}$$


Fig. 12.9 Quadrangle and 
associated triangle 


which form the associated triangle of the quadrangle $abcd$. Each permutation of the
vertices $a, b, c, d$ uniquely defines¹⁵ a projective linear automorphism of $\mathbb{P}_2$ sending
the quadrangle to itself. We get an injective homomorphism $S_4 \hookrightarrow PGL_3(\Bbbk)$, whose
image acts on the quadrangle $abcd$ and on the associated triangle $xyz$, permuting
the vertices $x, y, z$ in accordance with the incidences (12.8). For example, the
3-cycle $(b, c, a, d) \in S_4$ leads to the cyclic permutation $(y, z, x)$; the transpositions
$(b, a, c, d)$, $(a, c, b, d)$, $(c, b, a, d)$ lead to the transpositions $(x, z, y)$, $(y, x, z)$, $(z, y, x)$
respectively. Therefore, we get a surjective homomorphism $S_4 \twoheadrightarrow S_3$. Its kernel has
order $4!/3! = 4$ and coincides with the Klein four group formed by the identity map
and three pairs of transpositions in disjoint pairs: $(b, a, d, c)$, $(c, d, a, b)$, $(d, c, b, a)$.
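
The combinatorial content of this example can be replayed on a computer without any projective geometry: by (12.8), the points x, y, z correspond to the three ways of splitting the vertex set into two pairs. The following sketch rests on that assumption; the vertices a, b, c, d are encoded as 0, 1, 2, 3, and the name `induced` is hypothetical.

```python
from itertools import permutations

# The three splittings of {a, b, c, d} = {0, 1, 2, 3} into pairs of opposite edges.
pairings = [
    frozenset({frozenset({0, 1}), frozenset({2, 3})}),  # x = (ab) ∩ (cd)
    frozenset({frozenset({0, 2}), frozenset({1, 3})}),  # y = (ac) ∩ (bd)
    frozenset({frozenset({0, 3}), frozenset({1, 2})}),  # z = (ad) ∩ (bc)
]

def induced(p):
    """Permutation of the indices of (x, y, z) induced by the vertex permutation p."""
    images = []
    for pairing in pairings:
        moved = frozenset(frozenset(p[v] for v in pair) for pair in pairing)
        images.append(pairings.index(moved))
    return tuple(images)

s4 = list(permutations(range(4)))
image = {induced(p) for p in s4}
kernel = sorted(p for p in s4 if induced(p) == (0, 1, 2))

print(len(image), kernel)
```

The image consists of all 6 permutations of x, y, z, and the kernel consists of the identity together with the three double transpositions, as stated above.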


¹⁵ See Theorem 11.1 on p. 270.

Fig. 12.10 From cube to quadrangle


Example 12.11 (Proper Group of Cube and $S_4$) The proper group of the cube
$SO_{\mathrm{cube}} \subset SO_3(\mathbb{R})$ acts on the four lines $a, b, c, d$ joining opposite vertices and
on the three lines $x, y, z$ joining the centers of opposite faces (see Fig. 12.10).
On the projective plane $\mathbb{P}_2 = \mathbb{P}(\mathbb{R}^3)$, these seven lines become the vertices of the
quadrangle $abcd$ and associated triangle $xyz$ (see Fig. 12.9). Rotation by 180° about
an axis joining the midpoints of opposite edges of the cube swaps the two diagonals
joining the endpoints of these edges and takes each of the two remaining diagonals
to itself. Since all transpositions of diagonals are achieved by rotations of the cube,
we have a surjective homomorphism

$$SO_{\mathrm{cube}} \to S_4. \tag{12.9}$$

It is bijective, because both groups are of the same order 24. Under the isomorphism
(12.9), six rotations by $\pm 90°$ about lines $x, y, z$ go to six 4-cycles of cyclic
type $(4)$; three rotations by 180° about the same lines go to three pairs of disjoint
transpositions of cyclic type $(2,2)$; eight rotations by $\pm 120°$ about the lines $a, b, c, d$
go to eight 3-cycles of cyclic type $(3,1)$; and six rotations by 180° about the axes
joining the midpoints of opposite edges go to six simple transpositions of cyclic type
$(2,1,1)$. The homomorphism $SO_{\mathrm{cube}} \to S_3$ provided by the action of the group $SO_{\mathrm{cube}}$
on the lines $x, y, z$ is compatible with both the isomorphism (12.9) and the surjection
$S_4 \twoheadrightarrow S_3$ from Example 12.10. Its kernel consists of the Euclidean isometries of $\mathbb{R}^3$
sending each coordinate axis $x, y, z$ to itself. Therefore, it coincides with the dihedral
group $D_2$. The isomorphism (12.9) identifies this dihedral group with the kernel of
the surjection $S_4 \twoheadrightarrow S_3$ from Example 12.10.

Fig. 12.11 Cube on dodecahedron


Example 12.12 (Proper Dodecahedral Group and $A_5$) Each diagonal of a regular
pentagonal face of a dodecahedron is uniquely completed by appropriate diagonals
of the other faces¹⁶ to a cube whose edges are the chosen diagonals as in Fig. 12.11.
There are altogether five such cubes on the surface of the dodecahedron. They are
in bijection with the five diagonals of some fixed face of the dodecahedron. The
proper group of the dodecahedron permutes these five cubes. This provides us with a
homomorphism $\psi_{\mathrm{dod}} : SO_{\mathrm{dod}} \to S_5$. Looking at the model of the dodecahedron,¹⁷ it
is easy to see that $\psi_{\mathrm{dod}}$ identifies $20 \cdot 3 = 60$ rotations of the dodecahedral group with
60 even permutations of five cubes as follows: $6 \cdot 4 = 24$ rotations by angles $2\pi k/5$,
$1 \le k \le 4$, about the axes through the centers of opposite faces go to 24 maximal
cycles of cyclic type $(5)$; $10 \cdot 2 = 20$ rotations by angles $\pm 2\pi/3$ about the axes
through the opposite vertices go to 20 3-cycles of cyclic type $(3,1,1)$; 15 rotations
by angles 180° about the axes through the midpoints of opposite edges go to 15
pairs of disjoint transpositions of cyclic type $(2,2,1)$. Thus, we get an isomorphism
$\psi_{\mathrm{dod}} : SO_{\mathrm{dod}} \xrightarrow{\sim} A_5$.

Exercise 12.17 Prove independently that $\ker\psi_{\mathrm{dod}} = \{\mathrm{Id}\}$ and therefore $\psi_{\mathrm{dod}}$ is
injective.


In contrast with Example 12.5, if we pass from the proper dodecahedral group to the 
complete one, then we do not get more permutations of cubes, because the central 
symmetry of the dodecahedron acts trivially on the cubes. 


Exercise 12.18 Show that the groups $S_5$ and $O_{\mathrm{dod}}$ are not isomorphic.


¹⁶ By taking exactly one diagonal in each face.
¹⁷ I strongly recommend that you do Exercise 12.10 before reading further.

12.5 Group Actions


12.5.1 Definitions and Terminology 


Recall that we write $\mathrm{Aut}(X)$ for the group of all bijections $X \xrightarrow{\sim} X$. For a group $G$
and set $X$, a group homomorphism $\varphi : G \to \mathrm{Aut}(X)$ is called an action of the group
$G$ on the set $X$ or a representation of $G$ by the transformations of $X$. We write $G : X$
if some action of $G$ on $X$ is given, and we write $\varphi_g : X \to X$ for the transformation
$\varphi(g) : X \to X$ corresponding to $g \in G$ under the representation $\varphi$. The fact that
the mapping $g \mapsto \varphi_g$ is a group homomorphism means that $\varphi_{gh} = \varphi_g \circ \varphi_h$ for all
$g, h \in G$. When the action is clear from context or does not matter, we shall write
simply $gx$ instead of $\varphi_g(x)$. An action is called transitive if for every two points
$x, y \in X$, there exists a transformation $g \in G$ such that $gx = y$. More generally,
an action is called $m$-transitive if every ordered collection of $m$ distinct points of
$X$ can be sent to any other such collection by some transformation from the group
$G$. An action is called free if every element $g \ne e$ acts on $X$ without fixed points,
i.e., $\forall g \in G\ \forall x \in X$: $gx = x \Rightarrow g = e$. An action $\varphi : G \to \mathrm{Aut}\,X$ is called
exact (or faithful) if $\ker\varphi = \{e\}$, i.e., if every $g \ne e$ acts on $X$ nonidentically. Every
free action is clearly faithful. A faithful representation identifies a group $G$ with
some transformation group $\varphi(G) \subset \mathrm{Aut}(X)$ of the set $X$. Usually, some geometric
or algebraic structure on $X$ respected by $G$ stands behind such a representation.


Example 12.13 (Regular Actions) Let $X$ be the set of all elements of a group $G$
and $\mathrm{Aut}(X)$ the group of set-theoretic bijections $X \xrightarrow{\sim} X$ knowing nothing about the
group structure on $G$. The map

$$\lambda : G \to \mathrm{Aut}\,X, \quad g \mapsto (\lambda_g : x \mapsto gx), \tag{12.10}$$

that sends $g \in G$ to the transformation¹⁸ of $X$ provided by the left multiplication by $g$
is an action of $G$ on $X$, because $\lambda_{gh}(x) = ghx = \lambda_g(hx) = \lambda_g(\lambda_h(x)) = \lambda_g \circ \lambda_h(x)$.
It is called the left regular action of $G$ on itself. Since the equality $gh = h$ in $G$
implies the equality $g = e$, the left regular action is free and therefore exact. Thus,
every abstract group is isomorphic to some transformation group of an appropriate
set. This remark is known as Cayley's theorem.
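
Cayley's theorem can be made concrete for a small group. The sketch below, written under our own conventions (the group is S_3 stored as tuples, and `left_mult` is a hypothetical helper), turns every element g into the permutation of the six group elements given by left multiplication and checks that the resulting map is a free, injective homomorphism.

```python
from itertools import permutations

def compose(p, q):
    """Composition p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

elements = list(permutations(range(3)))          # the group G = S_3
index = {g: i for i, g in enumerate(elements)}   # numbering of the points of X = G
identity = (0, 1, 2)

def left_mult(g):
    """λ_g written as a permutation of the indices 0..5 of the group elements."""
    return tuple(index[compose(g, x)] for x in elements)

regular = {g: left_mult(g) for g in elements}

# λ_{gh} = λ_g ∘ λ_h, so λ is a homomorphism G → Aut(X).
is_hom = all(regular[compose(g, h)] == compose(regular[g], regular[h])
             for g in elements for h in elements)
# Distinct group elements give distinct transformations (the action is exact).
is_injective = len(set(regular.values())) == len(elements)
# No g ≠ e fixes any point (the action is free).
is_free = all(regular[g][i] != i for g in elements if g != identity for i in range(6))

print(is_hom, is_injective, is_free)
```

Thus S_3 is realized as a subgroup of order 6 inside the group of all 720 permutations of its own six elements.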

For example, the left regular representation realizes the additive group $\mathbb{R}$ as the
group of translations $\lambda_v : x \mapsto x + v$ of the real line. Similarly, the multiplicative
group $\mathbb{R}^*$ is realized as the group of homotheties $\lambda_c : x \mapsto cx$ of the punctured real
line $\mathbb{R} \smallsetminus \{0\}$.

Symmetrically, a right regular action $\varrho : G \to \mathrm{Aut}(X)$ sends an element $g \in G$
to the right multiplication by $g^{-1}$, i.e., $\varrho_g : x \mapsto xg^{-1}$. We use $g^{-1}$ in order to satisfy


¹⁸ Note that this transformation of $X$ is not a group homomorphism from $G$ to $G$, because in general
$g(h_1 h_2) \ne (gh_1)(gh_2)$.
the condition $\varrho_{g_1 g_2} = \varrho_{g_1} \varrho_{g_2}$. The use of $g$ would lead to an antihomomorphism
$G \to \mathrm{Aut}(X)$, which reverses the compositions.


Exercise 12.19 Verify that a right regular action is free. 


Example 12.14 (Adjoint Action) The map $\mathrm{Ad} : G \to \mathrm{Aut}(G)$, $g \mapsto \mathrm{Ad}_g$, sending
$g \in G$ to the conjugation-by-$g$ automorphism

$$\mathrm{Ad}_g : G \to G, \quad h \mapsto ghg^{-1}, \tag{12.11}$$


is called the adjoint action of G on itself. 


Exercise 12.20 Check that for each g € G, the conjugation map (12.11) is an 
invertible group homomorphism from G to G. Then verify that the assignment 


$$g \mapsto \mathrm{Ad}_g$$


defines a group homomorphism from G to Aut G. 


The image of the adjoint action is denoted by $\mathrm{Int}(G) \stackrel{\mathrm{def}}{=} \mathrm{Ad}(G) \subset \mathrm{Aut}\,G$ and is called
the group of inner automorphisms of the group $G$. The elements of the complement
$\mathrm{Aut}\,G \smallsetminus \mathrm{Int}(G)$ are called outer automorphisms of $G$. In contrast with the regular
actions, the adjoint action is neither free nor exact in general. For example, for an
abelian group $G$, every inner automorphism (12.11) is the identity, and the adjoint
action is trivial. For an arbitrary group $G$, the kernel of the adjoint action consists
of all $g \in G$ such that $ghg^{-1} = h$ for all $h \in G$. The latter is equivalent to $gh = hg$
and means that $g$ commutes with all elements of the group. Such elements $g$ form a
subgroup of $G$ called the center of $G$ and denoted by

$$Z(G) \stackrel{\mathrm{def}}{=} \ker(\mathrm{Ad}) = \{g \in G \mid \forall h \in G\ \ gh = hg\}. \tag{12.12}$$

The set of all elements $h \in G$ remaining fixed under the conjugation map (12.11)
consists of all elements commuting with $g$. It is called the centralizer of $g$ and
denoted by

$$C_g \stackrel{\mathrm{def}}{=} \{h \in G \mid hg = gh\}.$$


12.5.2 Orbits and Stabilizers


Every group $G$ acting on a set $X$ provides $X$ with a binary relation $y \sim x$, meaning
that $y = gx$ for some $g \in G$. This relation is reflexive, because $x = ex$. It is
symmetric, because $y = gx \iff x = g^{-1}y$. It is transitive, because $y = gx$ and
$z = hy$ force $z = (hg)x$. Therefore, we are dealing with an equivalence, which
breaks $X$ into a disjoint union of equivalence classes.¹⁹ The class of a point $x \in X$


¹⁹ See Sect. 1.2 on p. 7.
consists of all points that can be obtained from $x$ by means of transformations from
$G$. It is denoted by $Gx = \{gx \mid g \in G\}$ and called the orbit of $x$ under the action of
$G$. The set of all orbits is denoted by $X/G$ and called the quotient of $X$ by the action
of $G$. Associated with every orbit $Gx$ is the orbit map

$$\mathrm{ev}_x : G \twoheadrightarrow Gx, \quad g \mapsto gx, \tag{12.13}$$

which is a kind of evaluation. The fiber of this map over a point $y \in Gx$ consists
of all transformations sending $x$ to $y$. It is called the transporter from $x$ to $y$ and is
denoted by $G_{yx} = \{g \in G \mid gx = y\}$. The fiber over $x$ consists of all transformations
sending $x$ to itself. It is called the stabilizer of $x$ in $G$ and is denoted by

$$\mathrm{Stab}_G(x) = G_{xx} = \{g \in G \mid gx = x\}, \tag{12.14}$$


or just Stab(x) when the reference to G is not important. 


Exercise 12.21 Check that $\mathrm{Stab}_G(x)$ is a subgroup of $G$.


Given $y = gx$ and $z = hx$, then for every $s \in \mathrm{Stab}(x)$, the transformation $hsg^{-1}$ is
in $G_{zy}$. Conversely, if $fy = z$, then $h^{-1}fg \in \mathrm{Stab}(x)$. Thus, there are two mutually
inverse bijections

$$\mathrm{Stab}(x) \to G_{zy},\ s \mapsto hsg^{-1}, \qquad G_{zy} \to \mathrm{Stab}(x),\ f \mapsto h^{-1}fg. \tag{12.15}$$

Hence, for any three points $x, y, z$ lying in the same $G$-orbit, there is a bijection
between the transporter $G_{zy}$ and stabilizer $\mathrm{Stab}(x)$. This simple remark leads to the
following assertion.


Proposition 12.2 (Orbit Length Formula) Let $G$ be a finite transformation group
of an arbitrary set and let $x$ be any point of this set. Then

$$|Gx| = |G| : |\mathrm{Stab}_G(x)|.$$

In particular, the lengths of all orbits and orders of all stabilizers divide the order
of the group.


Proof The group $G$ decomposes into the disjoint union of the fibers of the surjective
orbit map (12.13). All the fibers have cardinality $|\mathrm{Stab}(x)|$. □
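
The proof can be traced on a small example. In the sketch below (our own illustration, not taken from the text) the group S_4 acts on the two-element subsets of {0, 1, 2, 3}; we list the orbit of one subset, its stabilizer, and the fibers of the orbit map, and check that every fiber has the size of the stabilizer.

```python
from itertools import permutations

group = list(permutations(range(4)))   # G = S_4 acting on subsets of {0, 1, 2, 3}

def act(g, subset):
    """Image of a subset under the permutation g."""
    return frozenset(g[i] for i in subset)

x = frozenset({0, 1})
orbit = {act(g, x) for g in group}
stabilizer = [g for g in group if act(g, x) == x]

# Fibers of the orbit map ev_x : g ↦ gx, i.e., the transporters from x to each y.
fibers = {y: [g for g in group if act(g, x) == y] for y in orbit}

print(len(group), len(orbit), len(stabilizer), sorted(len(f) for f in fibers.values()))
```

Here the orbit has 6 points and the stabilizer has 4 elements, so that 6 = 24 : 4, and each of the six fibers also consists of 4 elements.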


Proposition 12.3 The stabilizers of all points lying in the same orbit are conjugate:

$$y = gx \ \Rightarrow\ \mathrm{Stab}(y) = g\,\mathrm{Stab}(x)\,g^{-1} = \{ghg^{-1} \mid h \in \mathrm{Stab}(x)\}.$$

In particular, if one of them is finite, then they all are finite and have equal
cardinalities.


Proof Take $z = y$ in the diagram (12.15). □


Example 12.15 (Multinomial Coefficients Revisited) Fix an alphabet

$$A = \{a_1, a_2, \ldots, a_k\}$$

of cardinality $k$ and write $X$ for the set of all words of length $n$ in this alphabet.
Equivalently, $X$ can be viewed as the set of all maps $w : \{1, 2, \ldots, n\} \to A$. Each
permutation $\sigma \in S_n$ acts on $X$ by $w \mapsto w\sigma^{-1}$. In terms of words, $\sigma$ permutes the
letters of a word by the rule $a_{w_1} a_{w_2} \ldots a_{w_n} \mapsto a_{w_{\sigma^{-1}(1)}} a_{w_{\sigma^{-1}(2)}} \ldots a_{w_{\sigma^{-1}(n)}}$.


Exercise 12.22 Check that this provides $X$ with an action of $S_n$.


The $S_n$-orbit of a given word $w \in X$ consists of the words in which every letter $a_i \in A$
appears the same number of times as in $w$. Therefore, the points of the quotient
$X/S_n$ are naturally marked by the sequences $m_1, m_2, \ldots, m_k$, where $m_i$ is the number
of occurrences of the letter $a_i$ in each word of the orbit. The stabilizer $\mathrm{Stab}(w)$ of a
word $w$ in such an orbit consists of $m_1! \cdot m_2! \cdots m_k!$ independent permutations within
the groups of coinciding letters. Therefore, the length of this orbit is

$$|S_n w| = \frac{|S_n|}{|\mathrm{Stab}(w)|} = \frac{n!}{m_1! \cdot m_2! \cdots m_k!} = \binom{n}{m_1 \ldots m_k}$$

(compare with Example 1.2 on p. 5). We see that different orbits may have different
lengths, and the orders of stabilizers in different orbits may also be different.
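
The orbit length formula for words is easy to confirm numerically. The following sketch compares a brute-force count of the distinct rearrangements of a word with the multinomial coefficient; the sample words and function names are chosen by us for illustration only.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

def orbit_size_brute(word):
    """Number of distinct words obtained from `word` by permuting its letters."""
    return len(set(permutations(word)))

def orbit_size_formula(word):
    """n! / (m_1! m_2! ... m_k!), where m_i counts the occurrences of the i-th letter."""
    multiplicities = Counter(word).values()
    return factorial(len(word)) // prod(factorial(m) for m in multiplicities)

for word in ("aabbc", "abracad", "aaaa", "abcd"):
    print(word, orbit_size_brute(word), orbit_size_formula(word))
```

For instance, the word aabbc has 5!/(2! 2! 1!) = 30 rearrangements, while aaaa forms an orbit of length 1 and abcd an orbit of length 24.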


Exercise 12.23 For each Platonic solid $\Phi$, consider the natural action of $O_\Phi$ on
the set of (a) faces, (b) edges, (c) vertices of $\Phi$. Calculate $|O_\Phi|$ by means of
Proposition 12.2 applied to each of these actions.


Example 12.16 (Conjugation Classes in the Symmetric Group) The orbits of the
adjoint action $\mathrm{Ad} : G \to \mathrm{Aut}(G)$ are called conjugation classes in $G$. Such a
class $\mathrm{Ad}(G)h = \{ghg^{-1} \mid g \in G\}$ consists of all elements conjugate to a given
element $h \in G$. Let us describe the conjugation classes in the symmetric group
$S_n$. For a permutation $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n) \in S_n$, its conjugate permutation $g\sigma g^{-1}$
sends the element $g(i)$ to the element $g(\sigma_i)$ for each $i = 1, 2, \ldots, n$. Therefore, the
conjugation map $\mathrm{Ad}_g : \tau \mapsto g\tau g^{-1}$ provided by $g = (g_1, g_2, \ldots, g_n) \in S_n$ sends a
cycle $|i_1, i_2, \ldots, i_k\rangle \in S_n$ to the cycle $|g_{i_1}, g_{i_2}, \ldots, g_{i_k}\rangle$ formed by the $g$-images of
the elements from the original cycle. If $\sigma \in S_n$ has cyclic type $\lambda$ and disjoint cycles
of $\sigma$ are written in the rows of the diagram $\lambda$, then the action of $\mathrm{Ad}_g$ just permutes the
numbers in the cells of the diagram $\lambda$ by the rule $i \mapsto g_i$. Therefore, the conjugation
classes in $S_n$ are in bijection with the Young diagrams $\lambda$ of weight²⁰ $|\lambda| = n$. The
adjoint orbit corresponding to the diagram $\lambda$ consists of all permutations obtained
as follows: fill the cells of $\lambda$ by the numbers $1, 2, \ldots, n$ without repetitions and
form the product of cycles recorded in the rows of the diagram. The adjoint action of an

²⁰ That is, consisting of $n$ cells.
element $g \in S_n$ on such an orbit just permutes the numbers in the cells of the diagram
as prescribed by $g$. Such a permutation centralizes the product of cycles if and only
if it cyclically permutes the elements within the rows or arbitrarily permutes the
rows of equal length in their entirety. Thus, the centralizer has order

$$z_\lambda \stackrel{\mathrm{def}}{=} 1^{m_1} m_1! \cdot 2^{m_2} m_2! \cdots n^{m_n} m_n! = \prod_{\alpha=1}^{n} m_\alpha!\, \alpha^{m_\alpha},$$

where $m_i = m_i(\lambda)$ is the number of rows with length $i$ in $\lambda$. The cardinality of the
conjugation class of a permutation of cyclic type $\lambda$ is equal to $n!/z_\lambda$. For example,
the permutation

$$\sigma = (6, 5, 4, 7, 2, 1, 9, 8, 3) \in S_9$$

can be decomposed into disjoint cycles as $|7, 9, 3, 4\rangle\, |2, 5\rangle\, |1, 6\rangle\, |8\rangle$ and corresponds
to the filled Young diagram with rows

    7 9 3 4
    2 5
    1 6
    8

Its centralizer is generated by the cyclic shifts within the first three rows and the
transposition of the second and third rows. Its elements are in bijection with the set
$\mathbb{Z}/(4) \times \mathbb{Z}/(2) \times \mathbb{Z}/(2) \times \mathbb{Z}/(2)$, so it has cardinality $4 \cdot 2 \cdot 2 \cdot 2 = 32$ (the centralizer
itself is not abelian, because the swap of the two rows does not commute with a shift
within one of them). The conjugation class of $\sigma$ consists
of $9!/32 = 11340$ permutations.
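
The numbers in this example can be recomputed mechanically. The sketch below is an illustration under our own conventions: permutations are shifted to act on {0, ..., n-1}, and the names `cycle_type` and `z` are hypothetical. It evaluates the formula for the centralizer order on the permutation above and, as an independent check, counts a centralizer by brute force in the smaller group S_5.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cycle_type(p):
    """Cycle lengths of a permutation p of 0..n-1, in nonincreasing order."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return sorted(lengths, reverse=True)

def z(lengths):
    """Order of the centralizer: product of (length ** m) * m! over the row lengths,
    where m is the number of rows of that length."""
    result = 1
    for length, m in Counter(lengths).items():
        result *= length**m * factorial(m)
    return result

def compose(p, q):
    """Composition p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

sigma = tuple(v - 1 for v in (6, 5, 4, 7, 2, 1, 9, 8, 3))
lam = cycle_type(sigma)
print(lam, z(lam), factorial(9) // z(lam))

# Brute-force check in S_5 for a permutation of cyclic type (2, 2, 1).
tau = (1, 0, 3, 2, 4)
centralizer = [g for g in permutations(range(5)) if compose(g, tau) == compose(tau, g)]
print(len(centralizer), z(cycle_type(tau)))
```

The first line of output reproduces the cyclic type (4, 2, 2, 1), the centralizer order 32, and the class size 11340; the second confirms the formula on a case small enough to enumerate.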


12.5.3 Enumeration of Orbits


Given an action of a finite group G on a finite set X, the computation of the total 
number of orbits, that is, the cardinality of X/G, is met by an obvious difficulty: 
since orbits have different lengths, we have to use separate enumeration procedures 
for orbits of different types and, by the way, enumerate these types. The following 
claim avoids this problem quite elegantly. 


Theorem 12.2 (Burnside–Pólya–Redfield Formula) Let a finite group $G$ act on a
finite set $X$. For each $g \in G$, let $X^g$ be the fixed-point set of the transformation $g$, i.e.,
$X^g = \{x \in X \mid gx = x\} = \{x \in X \mid g \in \mathrm{Stab}(x)\}$. Then $|X/G| = |G|^{-1} \sum_{g \in G} |X^g|$.


Proof Write $F \subset G \times X$ for the set of all pairs $(g, x)$ such that $gx = x$. The
projections $F \to X$ and $F \to G$ show that

$$\bigsqcup_{x \in X} \mathrm{Stab}(x) = F = \bigsqcup_{g \in G} X^g.$$

The second equality leads to $|F| = \sum_{g \in G} |X^g|$. The first equality implies that
$|F| = |G| \cdot |X/G|$, because the stabilizers of all points in any orbit $Gx$ have the
same cardinality $|G|/|Gx|$, and the sum of these cardinalities over the $|Gx|$ points
of the orbit equals $|G|$. □


Example 12.17 (Necklaces) Given an unrestricted supply of uniform beads of $n$
distinct colors, how many different necklaces can be made using six beads? The
answer is given by the number of orbits in the natural action of the dihedral
group $D_6$ on the set of colorings of the vertices of a regular hexagon in $n$ given colors. The
group $D_6$ consists of 12 transformations: the identity $e$, two rotations $\tau^{\pm 1}$ by
angles $\pm 60°$, two rotations $\tau^{\pm 2}$ by angles $\pm 120°$, the central symmetry $\tau^3$, three
reflections $\sigma_{14}, \sigma_{23}, \sigma_{36}$ in the main diagonals, and three reflections $\overline{\sigma}_{14}, \overline{\sigma}_{23}, \overline{\sigma}_{36}$
in the perpendicular bisectors of the sides. The identity fixes all $n^6$ colorings.
The colorings fixed by the other transformations are shown in Fig. 12.12. Taking
there all possible combinations of colors, we get $n$, $n^2$, $n^3$, $n^4$, and $n^3$ colorings
respectively. Thus, by Theorem 12.2, the number of six-bead necklaces equals

$$\left(n^6 + 3n^4 + 4n^3 + 2n^2 + 2n\right) / 12.$$
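
The count of necklaces can be verified in two independent ways. In the sketch below, which is our own illustration, the dihedral group is modeled as 12 permutations of the positions 0, ..., 5; the orbits of colorings are counted once directly and once by averaging the numbers of fixed colorings, using the fact that a coloring is fixed by a transformation exactly when it is constant on each cycle of that transformation.

```python
from itertools import product

def dihedral(m):
    """The 2m symmetries of a regular m-gon as permutations of the vertices 0..m-1."""
    rotations = [tuple((i + k) % m for i in range(m)) for k in range(m)]
    reflections = [tuple((k - i) % m for i in range(m)) for k in range(m)]
    return rotations + reflections

def num_cycles(g):
    """Number of cycles of the permutation g, fixed points included."""
    seen, count = set(), 0
    for start in range(len(g)):
        if start not in seen:
            count += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = g[i]
    return count

def orbits_brute(n_colors, m=6):
    """Count orbits directly: each orbit is represented by its smallest coloring."""
    group = dihedral(m)
    representatives = {min(tuple(c[g[i]] for i in range(m)) for g in group)
                       for c in product(range(n_colors), repeat=m)}
    return len(representatives)

def orbits_burnside(n_colors, m=6):
    """Average number of fixed colorings over the group (Theorem 12.2)."""
    group = dihedral(m)
    return sum(n_colors ** num_cycles(g) for g in group) // len(group)

def necklace_formula(n):
    return (n**6 + 3 * n**4 + 4 * n**3 + 2 * n**2 + 2 * n) // 12

for n in (1, 2, 3, 4):
    print(n, orbits_brute(n), orbits_burnside(n), necklace_formula(n))
```

All three counts agree; for example, there are 13 necklaces in two colors and 92 in three colors.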

Fig. 12.12 Symmetric necklaces from six beads

12.6 Factorization of Groups


12.6.1 Cosets 


Each subgroup $H \subset G$ provides $G$ with two equivalence relations coming from the
left and right regular actions of $H$ on $G$. The left regular action $\lambda_h : g \mapsto hg$ leads to
the equivalence $g_1 \sim_L g_2$, meaning that $g_1 = hg_2$ for some $h \in H$. It decomposes $G$
into the disjoint union of orbits $Hg \stackrel{\mathrm{def}}{=} \{hg \mid h \in H\}$, called right cosets of $H$ in $G$.
The set of all right cosets of $H$ is denoted by $H \backslash G$.


Exercise 12.24 Check that the following conditions on a subgroup $H \subset G$ and
elements $g_1, g_2 \in G$ are equivalent: (a) $Hg_1 = Hg_2$, (b) $g_1 g_2^{-1} \in H$, (c) $g_2 g_1^{-1} \in H$.


The right regular action $\varrho_h : g \mapsto gh^{-1}$ leads to the equivalence $g_1 \sim_R g_2$, meaning
that $g_1 = g_2 h$ for some $h \in H$. It decomposes $G$ into a disjoint union of orbits
$gH \stackrel{\mathrm{def}}{=} \{gh \mid h \in H\}$ called left cosets²¹ of $H$ in $G$. The set of all left cosets of $H$ is
denoted by $G/H$.

Since both left and right regular actions of $H$ on $G$ are free, all orbits in both
actions have length $|H|$. Therefore, each of the two actions produces $|G|/|H|$ orbits.
This number is called the index of the subgroup $H$ in the group $G$ and is denoted by
$[G : H] \stackrel{\mathrm{def}}{=} |G/H|$. As a byproduct, we get the following theorem of Lagrange.


Theorem 12.3 (Lagrange's Theorem) The order of a finite group $G$ is divisible
by the order of every subgroup $H \subset G$. □

Corollary 12.3 The order of each element of a finite group divides the order of the
group.

Proof The order of an element $g$ is equal to the order of the cyclic subgroup $\langle g \rangle$
spanned by $g$. □
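
Here is a small computation illustrating cosets, the index, and Corollary 12.3. It is only a sketch under our own conventions: the group is S_3 stored as tuples, and H is the two-element subgroup generated by the transposition of 0 and 1.

```python
from itertools import permutations

def compose(p, q):
    """Composition p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

identity = (0, 1, 2)
G = list(permutations(range(3)))   # S_3
H = [identity, (1, 0, 2)]          # subgroup generated by the transposition of 0 and 1

left_cosets = {frozenset(compose(g, h) for h in H) for g in G}    # the cosets gH
right_cosets = {frozenset(compose(h, g) for h in H) for g in G}   # the cosets Hg

def order(g):
    """Smallest k >= 1 such that g^k is the identity."""
    k, power = 1, g
    while power != identity:
        power = compose(power, g)
        k += 1
    return k

print(len(left_cosets), len(right_cosets), left_cosets == right_cosets)
print(sorted(order(g) for g in G))
```

Both decompositions consist of |G|/|H| = 3 cosets of 2 elements each, but for this particular H the two decompositions differ. The element orders 1, 2, 2, 2, 3, 3 all divide 6.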


12.6.2 Normal Subgroups


A subgroup $H \subset G$ is called normal²² if all inner automorphisms of $G$ map $H$ to
itself, i.e., $gHg^{-1} = H$ for all $g \in G$. We write $H \lhd G$ for a normal subgroup $H$.

Exercise 12.25 Assume that $gHg^{-1} \subset H$ for all $g \in G$. Show that $H \lhd G$.

The equality $gHg^{-1} = H$ is equivalent to the equality $gH = Hg$. Therefore, $H \lhd G$
if and only if the left cosets of $H$ coincide with the right: $gH = Hg$ for all $g \in G$.


Example 12.18 (Kernels of Homomorphisms) In Proposition 12.1 on p. 289, we
proved that $g(\ker\varphi) = (\ker\varphi)g$ for every group homomorphism $\varphi : G_1 \to G_2$


²¹ Or left shifts.
²² Or invariant.
and $g \in G_1$. Hence, $\ker\varphi \lhd G_1$ is normal in $G_1$. This can be seen as well from
Exercise 12.25: for every $h \in \ker\varphi$ and $g \in G_1$, we have

$$\varphi(ghg^{-1}) = \varphi(g)\varphi(h)\varphi(g)^{-1} = \varphi(g)\varphi(g)^{-1} = e.$$

Hence $g \ker(\varphi) g^{-1} \subset \ker\varphi$ for all $g \in G_1$.


Example 12.19 ($V_4 \lhd S_4$) The Klein four group $V_4 \subset S_4$ formed by the identity
permutation and three permutations of cyclic type $(2,2)$ is normal, because it is the
union of two conjugation classes in $S_4$. At the same time, $V_4$ is the kernel of the
epimorphism $S_4 \twoheadrightarrow S_3$ from Example 12.10 on p. 291.


Example 12.20 (Inner Automorphisms) The inner automorphisms of a group $G$
form a normal subgroup $\mathrm{Int}(G) \lhd \mathrm{Aut}(G)$, because for an inner automorphism
$\mathrm{Ad}_g : h \mapsto ghg^{-1}$ and an arbitrary automorphism $\varphi : G \xrightarrow{\sim} G$, the conjugate
automorphism $\varphi \circ \mathrm{Ad}_g \circ \varphi^{-1} = \mathrm{Ad}_{\varphi(g)}$ is inner.


Exercise 12.26 Check the latter equality. 


Example 12.21 (Affine Group) Recall²³ that associated with every vector space $V$
is the group of affine automorphisms $\mathrm{Aff}(V)$ of the affine space $\mathbb{A}(V)$ over $V$. It
follows from Proposition 6.6 and Proposition 6.7 on p. 148 that the differentiation
map

$$D : \mathrm{Aff}(V) \to GL(V), \quad \varphi \mapsto D_\varphi, \tag{12.16}$$

which takes an affine map to its differential, is a group homomorphism. By
Proposition 6.7, its kernel coincides with the subgroup of shift transformations
$\tau_v : p \mapsto p + v$. Thus, the parallel shifts form a normal subgroup of $\mathrm{Aff}(V)$. This
subgroup is isomorphic to the additive group of the vector space $V$.


Exercise 12.27 Verify that $\varphi \tau_v \varphi^{-1} = \tau_{D_\varphi(v)}$ for every $v \in V$ and $\varphi \in \mathrm{Aff}(V)$.


Note that the differentiation map (12.16) is surjective, because for every $F \in GL(V)$
and $p \in \mathbb{A}(V)$, the map $F_p : \mathbb{A}(V) \to \mathbb{A}(V)$, $q \mapsto p + F(\overrightarrow{pq})$, is affine, bijective,
and has $D_{F_p} = F$.


12.6.3 Quotient Groups 


Given a group $G$ and a subgroup $H \subset G$, an attempt to define a group structure on
the set of left cosets $G/H$ by means of our usual formula

$$(g_1 H) \cdot (g_2 H) \stackrel{\mathrm{def}}{=} (g_1 g_2) H \tag{12.17}$$


²³ See Sect. 6.5.6 on p. 148.
fails, because equal cosets $g_1 H = f_1 H$ and $g_2 H = f_2 H$ may produce $(g_1 g_2) H \ne
(f_1 f_2) H$.


Exercise 12.28 Let $G = S_3$ and $H = \langle s_{12} \rangle$, where $s_{12} = (2, 1, 3)$. Explicitly
indicate some cosets for which definition (12.17) is incorrect.


Proposition 12.4 The group structure on $G/H$ is well defined by (12.17) if and only
if $H \lhd G$.


Proof Let formula (12.17) provide $G/H$ with a well-defined binary operation. Then
this operation is associative, because $(g_1 H \cdot g_2 H) \cdot g_3 H = (g_1 g_2) H \cdot g_3 H =
((g_1 g_2) g_3) H = (g_1 (g_2 g_3)) H = g_1 H \cdot (g_2 g_3) H = g_1 H \cdot (g_2 H \cdot g_3 H)$. It has the
unit $eH = H$. Every coset $gH$ has an inverse coset $g^{-1}H$. Hence $G/H$ is a group,
and the quotient map $G \twoheadrightarrow G/H$, $g \mapsto gH$, is a group homomorphism with kernel
$H$. Therefore, $H \lhd G$ by Example 12.18.

Conversely, let $H \lhd G$ be normal. For subsets $A, B \subset G$, we put

$$AB \stackrel{\mathrm{def}}{=} \{ab \mid a \in A,\ b \in B\}.$$

For example, $HH = H$. This notation agrees with the notation $gH$ for the left coset,
because the latter consists of all $gh$ with $h \in H$. Since the product $(g_1 H)(g_2 H) =
\{ab \mid a \in g_1 H,\ b \in g_2 H\}$ depends only on the cosets, it is enough to check that
$(g_1 H)(g_2 H) = g_1 g_2 H$. This forces the latter coset to be independent of the choice
of $g_1 \in g_1 H$, $g_2 \in g_2 H$. Since $H$ is normal, $gH = Hg$ for all $g \in G$. We use this
twice for $g = g_1$ and $g = g_1 g_2$ to get $(g_1 H)(g_2 H) = H g_1 g_2 H = g_1 g_2 H H = g_1 g_2 H$,
as required. □


Definition 12.2 (Quotient Group) For a normal subgroup $H \lhd G$, the set of cosets
$G/H$ equipped with the group structure

$$g_1 H \cdot g_2 H \stackrel{\mathrm{def}}{=} \{sr \mid s \in g_1 H,\ r \in g_2 H\} = (g_1 g_2) H$$

is called the quotient (or factor) group of $G$ by $H$. The map $G \twoheadrightarrow G/H$, $g \mapsto gH$,
is called a quotient homomorphism.


Corollary 12.4 (Decomposition of Homomorphisms) A group homomorphism
$\varphi : G_1 \to G_2$ can be factored as the composition of a quotient epimorphism

$$G_1 \twoheadrightarrow G_1 / \ker\varphi$$

followed by an injective homomorphism

$$G_1 / \ker\varphi \hookrightarrow G_2$$

sending the coset $g \ker\varphi \in G_1 / \ker\varphi$ to $\varphi(g) \in G_2$. In particular, $\operatorname{im}\varphi \simeq G_1 / \ker\varphi$.


Proof The corollary states that $\varphi^{-1}(\varphi(g)) = g \ker\varphi$ for every $\varphi(g) \in \operatorname{im}\varphi$. We
have already seen this in Proposition 12.1 on p. 289. □

Proposition 12.5 For two subgroups $N, H \subset G$ such that $N \lhd G$, the product
$HN \stackrel{\mathrm{def}}{=} \{hx \mid h \in H,\ x \in N\}$ is a subgroup of $G$. Moreover, $H \cap N \lhd H$, $N \lhd HN$,
and $HN/N \simeq H/(H \cap N)$.


Proof The product $HN \subset G$ is a subgroup, because for all $h_1, h_2, h \in H$ and all
$x_1, x_2, x \in N$, we have

$$h_1 x_1 h_2 x_2 = (h_1 h_2)\left(h_2^{-1} x_1 h_2 \cdot x_2\right) \in HN, \qquad
(hx)^{-1} = x^{-1} h^{-1} = h^{-1}\left(h x^{-1} h^{-1}\right) \in HN, \tag{12.18}$$

since $h_2^{-1} x_1 h_2 \in N$ and $h x^{-1} h^{-1} \in N$. The subgroup $H \cap N \lhd H$ is normal, because
$N \lhd G$ is normal. Now consider the surjective map $\varphi : HN \to H/(H \cap N)$ sending
the product $hx$ to the coset $h \cdot (H \cap N)$. It is well defined, because the equality
$h_1 x_1 = h_2 x_2$ implies that $h_1^{-1} h_2 = x_1 x_2^{-1} \in H \cap N$, and therefore $h_1 \cdot (H \cap N) =
h_1 \cdot (h_1^{-1} h_2) \cdot (H \cap N) = h_2 \cdot (H \cap N)$. It follows from (12.18) that $\varphi$ is a group
homomorphism. Since $\ker\varphi = eN = N$, we conclude that $H/(H \cap N) = \operatorname{im}\varphi \simeq
HN/\ker\varphi = HN/N$ by Corollary 12.4. □
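
The proposition can be illustrated inside S_4. In the sketch below, which is our own example, N is the Klein four group (normal in S_4 by Example 12.19) and H is the cyclic subgroup generated by a 4-cycle; the code lists HN, the intersection of H and N, and the two sets of cosets whose cardinalities the proposition asserts to be equal.

```python
def compose(p, q):
    """Composition p∘q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

identity = (0, 1, 2, 3)
N = [identity, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]   # Klein four group V_4

# H: the cyclic subgroup generated by the 4-cycle c : i ↦ i + 1 (mod 4).
c = (1, 2, 3, 0)
H = [identity]
while compose(c, H[-1]) != identity:
    H.append(compose(c, H[-1]))

HN = {compose(h, x) for h in H for x in N}
H_cap_N = [h for h in H if h in N]

# HN is closed under multiplication, so it is a subgroup.
is_closed = all(compose(a, b) in HN for a in HN for b in HN)

cosets_HN_mod_N = {frozenset(compose(g, x) for x in N) for g in HN}
cosets_H_mod_cap = {frozenset(compose(h, x) for x in H_cap_N) for h in H}

print(len(H), len(HN), len(H_cap_N), len(cosets_HN_mod_N), len(cosets_H_mod_cap))
```

Here H has 4 elements, the intersection with N has 2, and HN has 8, so both quotients HN/N and H/(H ∩ N) have 2 elements.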


Exercise 12.29 Let $\varphi : G_1 \twoheadrightarrow G_2$ be a surjective group homomorphism. Show that
for every normal subgroup $N_2 \lhd G_2$, its preimage $N_1 = \varphi^{-1}(N_2)$ is normal in $G_1$,
and $G_1/N_1 \simeq G_2/N_2$.


Problems for Independent Solution to Chap. 12 


Problem 12.1 Show that an associative binary operation G x G — G provides G 
with a group structure if and only if for all a,b € G, the equations ax = b and 
ya = b have unique solutions x, y € G. 

Problem 12.2 Enumerate all the subgroups in the dihedral groups²⁴ $D_4$ and $D_6$.

Problem 12.3 Show that every subgroup of a cyclic group is cyclic. 

Problem 12.4 Find the parity of the order of an arbitrary odd permutation. 

Problem 12.5 In the permutation group $S_5$, calculate (a) g! for g =
(3, 5, 4, 1, 2), (b) the number of elements remaining fixed under conjugation by
the permutation (3, 5, 1, 2, 4).

Problem 12.6 (Involutive Permutations) A permutation $\sigma$ is called involutive if
$\sigma^2 = \mathrm{Id}$. Prove that $\sigma$ is involutive if and only if the Young diagram depicting the
cyclic type of $\sigma$ has at most two columns. Show that every cycle of length $\ge 3$ in
$S_n$ is a composition of two involutive permutations.


²⁴See Example 12.4 on p. 284.
Problem 12.7 (N.N. Konstantinov) The residents of town N may exchange their
houses. However, only simple two-way exchanges²⁵ are allowed, and each
homeowner can make at most one exchange per day. Can an arbitrary complex
exchange be made in two days?


Problem 12.8 Is it possible to swap the “1” and “2” tiles in the 15-puzzle following
the rules of the game of fifteen? (All the other tiles should return to their initial
positions.)²⁶


Problem 12.9 (Orders of Elements) For an arbitrary group: 


(a) Show that every element of odd order is the square of some element of the 
group. 

(b) Find ord(fg) if ord(gf) = n. 

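(c) Prove that $\operatorname{ord}(g^n) = \operatorname{ord}(g)/\mathrm{GCD}(n, \operatorname{ord}(g))$ for all $n \in \mathbb{N}$.

(d) For $fg = gf$, prove that $\operatorname{ord}(fg) \mid \mathrm{LCM}(\operatorname{ord}(f), \operatorname{ord}(g))$.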


Problem 12.10 Let $\operatorname{ord}(g) = 2$ for all $g \neq e$ in some group $G$. Show that $G$ is
abelian and describe all such finite groups. 


Problem 12.11 Let finite groups $G$, $H$ have, for each $k \in \mathbb{N}$, the same number
of elements of order $k$. Is it true that $G \simeq H$? For the continuation of this story,
see Problem 14.27 on p. 359.


Problem 12.12 We say that a group $G$ is spanned by a set $B \subset G$ if every element of
$G$ is a finite composition of elements of $B$ (possibly repeated). (a) Is $S_n$ spanned
by the transposition $|1, 2\rangle$ and the cycle $|1, 2, 3, \ldots, n\rangle$? (b) Is $A_n$ spanned by the cycles
$|1, 2, 3\rangle, |1, 2, 4\rangle, \ldots, |1, 2, n\rangle$?

Problem 12.13 Show that every finite group spanned by two nontrivial involutions²⁷
is isomorphic to a dihedral group.




Problem 12.14 Find the orders of the complete and proper groups of the standard
(a) $n$-cube,²⁸ (b) $n$-cocube,²⁹ (c) $n$-simplex.³⁰ To begin with, consider $n = 4$.

Problem 12.15 (Group $Q_8$) Let us equip the set $Q_8 \stackrel{\text{def}}{=} \{\pm e, \pm i, \pm j, \pm k\}$ with a
multiplication such that $e$ is a neutral element, signs are multiplied by the standard
rules,³¹ $i^2 = j^2 = k^2 = -e$, and $ij = -ji = k$, $jk = -kj = i$, $ki = -ik = j$.
Check that this is a group structure and find out whether $Q_8$ is isomorphic to $D_4$.


²⁵Such that A takes B’s and B takes A’s house; any more complicated combination (e.g., A takes
B’s house, B takes C’s house, and C takes A’s house) is forbidden.


²⁶See https://en.wikipedia.org/wiki/15_puzzle.

²⁷That is, elements $\sigma \neq e$ such that $\sigma^2 = e$.

²⁸See Problem 10.7 on p. 248.

²⁹See Problem 10.13 on p. 250.

³⁰See Problem 10.9 on p. 249.

³¹As in $\mathbb{R}$: $+ \cdot + = - \cdot - = +$, $+ \cdot - = - \cdot + = -$.
Problem 12.16 (Direct Product of Groups) For a set of groups $\{G_x\}_{x \in X}$, verify
that their set-theoretic direct product $\prod_{x \in X} G_x$ equipped with componentwise
composition

$$(g_x)_{x \in X} \cdot (h_x)_{x \in X} \stackrel{\text{def}}{=} (g_x h_x)_{x \in X}$$

is a group. Given two subgroups $F, H \subset G$, show that $G \simeq F \times H$ if and only if
the following three conditions hold simultaneously:


(1) $F \cap H = \{e\}$, (2) $FH = G$, (3) $\forall f \in F\ \forall h \in H,\ fh = hf$.


Problem 12.17 For which $n, m \in \mathbb{N}$ do we have $D_{mn} \simeq D_m \times \mathbb{Z}/(n)$?


Problem 12.18 Find all pairs of isomorphic groups in the following collections of 
groups: 


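(a) $D_8$, $D_4 \times \mathbb{Z}/(2)$, $Q_8 \times \mathbb{Z}/(2)$;
(b) $S_4$, $D_{12}$, $D_6 \times \mathbb{Z}/(2)$, $D_3 \times \mathbb{Z}/(2) \times \mathbb{Z}/(2)$, $D_3 \times \mathbb{Z}/(4)$, $Q_8 \times \mathbb{Z}/(3)$, $D_4 \times \mathbb{Z}/(3)$.


Problem 12.19 List all Platonic solids³² $\Phi$ such that $O_\Phi \simeq SO_\Phi \times \mathbb{Z}/(2)$.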


Problem 12.20 For each Platonic solid $\Phi$, find the lengths of all the orbits for the
tautological actions of the groups $O_\Phi$ and $SO_\Phi$ on $\Phi$. Indicate all points whose
orbits are shorter than a generic orbit.


Problem 12.21 (Diagonal Actions) Let a group $G$ act on sets $X_1, X_2, \ldots, X_m$ and
define the diagonal action of $G$ on $X_1 \times X_2 \times \cdots \times X_m$ by

$$g : (x_1, x_2, \ldots, x_m) \mapsto (g x_1, g x_2, \ldots, g x_m).$$

Write $V$ (respectively $E$) for the set of vertices (respectively edges) of the standard
cube in $\mathbb{R}^3$. The proper group of the cube $SO_{\text{cube}}$ acts tautologically on $V$ and
$E$. Describe the orbits of the diagonal action of $SO_{\text{cube}}$ on (a) $V \times V$, (b) $V \times E$,
(c) $E \times E \times E$.

Problem 12.22 For the standard action of the permutation group $S_n$ on $X =
\{1, 2, \ldots, n\}$, describe the orbits of the diagonal action of $S_n$ on $X^m$ for all $n \geq m$.
(To begin with, take $m = 2, 3, \ldots$)


Problem 12.23 A finite group acts transitively on a set of cardinality at least 2. Show 
that there is some element in the group acting without fixed points. 


Problem 12.24 Given an unrestricted supply of uniform beads of n distinct colors, 
how many different necklaces can be made from (a) four, (b) seven, (c) eight, 
(d) nine beads? 


³²See Exercise 12.10 on p. 283.
Problem 12.25 Given an unrestricted supply of uniform pieces of string of $n$ distinct
colors, how many distinct³³ string linkage patterns of the form


(a) (b) 


can be made from them? (The pieces of string are tied together at the endpoints 
only.) 

Problem 12.26 Find the index of the subgroup $\operatorname{Int}(A_5)$ in $\operatorname{Aut}(A_5)$.

Problem 12.27* Find an outer automorphism³⁴ of $S_6$.


Problem 12.28 The product of two left cosets³⁵ of a subgroup $H$ is always a left
coset of $H$. Show that $H$ is normal.


Problem 12.29 Show that for every pair of normal subgroups $N$, $H$ with $N \cap H =
\{e\}$, we have $nh = hn$ for all $n \in N$, $h \in H$.


Problem 12.30 Is there a nonabelian group G such that every subgroup of G is 
normal? 


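Problem 12.31 (Commutator Subgroup) The smallest subgroup of $G$ containing
all commutators $[g, h] = g h g^{-1} h^{-1}$, $g, h \in G$, is called the commutator subgroup
of $G$ and is denoted by $[G, G]$. Show that (a) $[G, G] \triangleleft G$ is normal and the
quotient group $G/[G, G]$ is abelian. (b) $[G, G]$ is contained in the kernel of every
homomorphism from $G$ to an arbitrary abelian group.


Problem 12.32 Find the order $|\mathrm{PSL}_n(\mathbb{F}_q)|$.

Problem 12.33 Construct isomorphisms of the following groups:


(a) $\mathrm{PSL}_2(\mathbb{F}_3) \xrightarrow{\sim} A_4$,
(b) $\mathrm{PGL}_2(\mathbb{F}_4) \xrightarrow{\sim} A_5$,
(c) $\mathrm{PSL}_2(\mathbb{F}_5) \xrightarrow{\sim} A_5$,
(d) $\mathrm{PSL}_3(\mathbb{F}_2) \xrightarrow{\sim} \mathrm{PSL}_2(\mathbb{F}_7)$,
(e) $\mathrm{PSL}_2(\mathbb{F}_9) \xrightarrow{\sim} A_6$.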


Problem 12.34 (Steiner Systems and Mathieu Groups) A collection S of subsets 
of cardinality k in a finite set X of cardinality n is called a Steiner system S(t, k,n) 


³³That is, nonsuperposable in $\mathbb{R}^3$.

³⁴Hint: find two different conjugation classes of equal cardinalities and try to map one to the
other by an appropriate group homomorphism.
³⁵That is, the set of all products $xy$, where $x$ and $y$ run through two left cosets.
if every subset of cardinality $t$ in $X$ is contained in exactly one set from the
collection $S$. For such a collection $S$, the group

$$\operatorname{Aut} S \stackrel{\text{def}}{=} \{g \in \operatorname{Aut}(X) \mid \forall\, Y \in S\ \ g(Y) \in S\}$$

is called the automorphism group of $S$.


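(a) Given a Steiner system $S(t, k, n)$, construct the Steiner system $S(t-1, k-1,
n-1)$.

(b) Use the projective geometry over the finite field $\mathbb{F}_q$ to construct the Steiner
systems $S(2, q, q^2)$ and $S(2, q+1, q^2+q+1)$ for all $q = p^k$ with prime
$p \in \mathbb{N}$.

(c) Show that the Steiner system $S(5, 6, 12)$ for

$$X = \mathbb{P}_1(\mathbb{F}_{11}) = \{[0], [1], \ldots, [10], \infty\}$$

is formed by the $\mathrm{PGL}_2(\mathbb{F}_{11})$-orbits of all squares $[0], [1], [4], [9], [3], [5]$ in $\mathbb{F}_{11}$.


(d*) Construct the Steiner system $S(5, 8, 24)$.
(e*) Find the orders of the Mathieu groups³⁶

$$
\begin{aligned}
M_{10} &\stackrel{\text{def}}{=} \operatorname{Aut} S(3, 4, 10), & M_{22} &\stackrel{\text{def}}{=} \operatorname{Aut} S(3, 6, 22),\\
M_{11} &\stackrel{\text{def}}{=} \operatorname{Aut} S(4, 5, 11), & M_{23} &\stackrel{\text{def}}{=} \operatorname{Aut} S(4, 7, 23),\\
M_{12} &\stackrel{\text{def}}{=} \operatorname{Aut} S(5, 6, 12), & M_{24} &\stackrel{\text{def}}{=} \operatorname{Aut} S(5, 8, 24).
\end{aligned}
$$

(f) Show that the Mathieu groups $M_{11}$, $M_{22}$, $M_{23}$ appear as stabilizers of some
points under the tautological actions of the Mathieu groups $M_{12}$, $M_{23}$, $M_{24}$ on
their Steiner systems.

(g) Construct an isomorphism $\mathrm{PGL}_3(\mathbb{F}_4) \xrightarrow{\sim} M_{21} \stackrel{\text{def}}{=} \operatorname{Aut} S(2, 5, 21)$.

(h*) Construct an isomorphism $A_6 \xrightarrow{\sim} [M_{10}, M_{10}]$.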
Chapter 13
Descriptions of Groups 


13.1 Generators and Relations 


13.1.1 Free Groups 


Associated with a set $X$ is the free group $F_X$ spanned by $X$ and described as follows.
Consider an alphabet formed by letters $x$ and $x^{-1}$, where $x \in X$. On the set of all
words¹ of this alphabet consider the smallest equivalence relation “=” that identifies
two words obtained from each other by inserting or deleting any number of copies of
$xx^{-1}$ or $x^{-1}x$ (or both) at the beginning, or at the end, or between any two sequential
letters. By definition, the elements of the free group $F_X$ are the equivalence classes
of words with respect to this equivalence. The composition is the concatenation of
words:

$$x_1 x_2 \ldots x_k \cdot y_1 y_2 \ldots y_m \stackrel{\text{def}}{=} x_1 x_2 \ldots x_k y_1 y_2 \ldots y_m.$$

Exercise 13.1 Verify that composition is well defined on the equivalence classes.


The class of the empty word is the unit of $F_X$. Inversion swaps the letters $x$, $x^{-1}$ and
reverses the order of the letters: $(x_1 x_2 \ldots x_m)^{-1} = x_m^{-1} \ldots x_2^{-1} x_1^{-1}$. We say that a
word is irreducible if it does not contain any fragments $xx^{-1}$ or $x^{-1}x$.


Exercise 13.2 Check that there is exactly one irreducible word in each equivalence 
class and that it is the shortest word of the class. 
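
The irreducible representative of a class can be computed mechanically by cancelling adjacent mutually inverse letters. The following Python sketch is a minimal illustration, not part of the text's argument: the encoding of a letter $x^{\pm 1}$ as a pair `(name, ±1)` and the function names are our own assumptions.

```python
def reduce_word(word):
    """Cancel adjacent pairs x x^-1 and x^-1 x until none remain.

    A word is a sequence of (letter, exponent) pairs with exponent +1 or -1.
    """
    stack = []
    for letter, exponent in word:
        if stack and stack[-1] == (letter, -exponent):
            stack.pop()                      # the new letter cancels the previous one
        else:
            stack.append((letter, exponent))
    return stack

def multiply(u, v):
    """Composition in the free group: concatenate, then reduce."""
    return reduce_word(list(u) + list(v))

def invert(word):
    """Swap x and x^-1 and reverse the order of the letters."""
    return [(letter, -exponent) for letter, exponent in reversed(word)]

# The word x y y^-1 x^-1 y lies in the class of the one-letter word y.
w = [("x", 1), ("y", 1), ("y", -1), ("x", -1), ("y", 1)]
print(reduce_word(w))
print(multiply(w, invert(w)))   # the empty word, i.e., the unit
```

A single left-to-right pass with a stack suffices here, because a cancellation can only expose a new cancellable pair at the top of the stack.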


The elements of $X$ are called the generators of the free group $F_X$. The free group
with $k$ generators is denoted by $F_k$. The group $F_1 \simeq \mathbb{Z}$ is the cyclic group of infinite
order. The group $F_2$ is formed by classes of words written with four letters $x$, $y$, $x^{-1}$,
$y^{-1}$. It is already quite vast.


Exercise 13.3 Construct an injective group homomorphism $F_{\mathbb{N}} \hookrightarrow F_2$.


¹Including the empty word $\varnothing$.
Proposition 13.1 (Universal Property of Free Groups) The map $i : X \to F_X$
that sends $x \in X$ to the class of the word $x$ possesses the following universal
property: for every group $G$ and every map of sets $\Gamma : X \to G$, there exists
a unique group homomorphism $\mathrm{ev}_\Gamma : F_X \to G$ such that $\Gamma = \mathrm{ev}_\Gamma \circ i$. It takes
the class of the word $x_1^{\varepsilon_1} x_2^{\varepsilon_2} \ldots x_m^{\varepsilon_m}$, where $x_\nu \in X$ and $\varepsilon_\nu = \pm 1$, to the product
$\Gamma(x_1)^{\varepsilon_1}\, \Gamma(x_2)^{\varepsilon_2} \cdots \Gamma(x_m)^{\varepsilon_m} \in G$. If some group $F'$ and some map $i' : X \to F'$
also have this universal property, then there exists a unique group isomorphism
$\varphi : F_X \xrightarrow{\sim} F'$ such that $i' = \varphi \circ i$.


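Proof The homomorphism $\mathrm{ev}_\Gamma$ is unique, because it has to act by the rule described
in the proposition. At the same time, this rule certainly produces a well-defined
homomorphism $\mathrm{ev}_\Gamma : F_X \to G$ such that $\Gamma = \mathrm{ev}_\Gamma \circ i$. Thus, the map $i : X \to F_X$ has
the required universal property. Let $i' : X \to F'$ be another map such that for every
map $\Gamma : X \to G$, there exists a unique homomorphism $\mathrm{ev}'_\Gamma : F' \to G$ extending $\Gamma$.
Then there are unique homomorphisms $\varphi = \mathrm{ev}_{i'} : F_X \to F'$ and $\psi = \mathrm{ev}'_i : F' \to F_X$
that can be fitted into the commutative diagrams

[Commutative diagrams: the triangles $i' = \varphi \circ i$ and $i = \psi \circ i'$, and the diagram obtained by composing them.]

The second diagram shows that $i = \psi\varphi \circ i$ and $i' = \varphi\psi \circ i'$. These equalities imply
that $\psi\varphi = \mathrm{Id}_{F_X}$ and $\varphi\psi = \mathrm{Id}_{F'}$, because the factorizations $i = \mathrm{ev}_i \circ i$, $i' = \mathrm{ev}'_{i'} \circ i'$
hold for unique $\mathrm{ev}_i : F_X \to F_X$, $\mathrm{ev}'_{i'} : F' \to F'$, and we know that $\mathrm{ev}_i = \mathrm{Id}_{F_X}$ and
$\mathrm{ev}'_{i'} = \mathrm{Id}_{F'}$ produce such factorizations. □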


13.1.2. Presentation of a Group by Generators and Relators 


Let $G$ be an arbitrary group and $X$ any set allowing an injective map to $G$.
By Proposition 13.1, every inclusion $\Gamma : X \hookrightarrow G$ can be uniquely extended to a
homomorphism $\mathrm{ev}_\Gamma : F_X \to G$, $x \mapsto \Gamma(x)$. If $\mathrm{ev}_\Gamma$ is surjective, then the subset
$\Gamma(X) \subset G$ is called a generating set for $G$, and the elements $g_x = \Gamma(x) \in G$,
indexed by $x \in X$, are called generators of the group $G$. In this case, $G$ is exhausted
by finite products of the form $g_1^{\varepsilon_1} g_2^{\varepsilon_2} \cdots g_k^{\varepsilon_k}$, where $g_i \in \Gamma(X)$ and $\varepsilon_i = \pm 1$. A group
$G$ is called finitely generated if it admits a finite generating set. For an inclusion
$\Gamma : X \hookrightarrow G$ that produces the surjective homomorphism

$$\mathrm{ev}_\Gamma : F_X \twoheadrightarrow G, \tag{13.1}$$

the kernel $\ker \mathrm{ev}_\Gamma \triangleleft F_X$ is called the group of relations among the generators
$g_x = \Gamma(x)$. The subset $R \subset \ker \mathrm{ev}_\Gamma$ is called a relating set, and its elements
are called relators, if $\ker \mathrm{ev}_\Gamma$ is the minimal normal subgroup in $F_X$ containing
$R$ with respect to inclusions.² This means that every relation $w \in \ker \mathrm{ev}_\Gamma$ can be
assembled from some finite collection of words in $R$ by means of a finite number
of compositions, inversions, and conjugations by arbitrary elements of $F_X$. In other
words, each relator $x_1^{\varepsilon_1} x_2^{\varepsilon_2} \cdots x_m^{\varepsilon_m} \in R$ produces the identity $g_{x_1}^{\varepsilon_1} g_{x_2}^{\varepsilon_2} \cdots g_{x_m}^{\varepsilon_m} = e$ in $G$,
and all constraints on the generators $g_x$ in $G$ can be deduced from these identities by
multiplying two identities side by side, inverting both sides of an identity, and
multiplying both sides of an identity by an element of $G$ from either the left or the right.
Every group $G$ can be generated by some $\Gamma(X) \subset G$, at worst for $X = G$. If some
epimorphism (13.1) is given and some relating set $R \subset F_X$ is known, then the pair
$(X, R)$ is called a presentation of $G$ by generators and relators. Every presentation
determines the group $G$ uniquely up to isomorphism, because

$$G \simeq F_X/N_R,$$

where $N_R \triangleleft F_X$ is the smallest normal subgroup containing $R$. A group $G$ is called
finitely presented if it admits a presentation $(X, R)$ where both sets $X$, $R$ are finite. A
happy choice of presentation (e.g., with small $X$, $R$ and nice relators) sometimes can
appreciably clarify the structure of a group or its action somewhere. However, in
general, the description of a group by generators and relators may be quite obscure.
Even deciding whether the group is trivial may be extremely difficult. In the
formal sense accepted in mathematical logic, the latter problem is undecidable even
in the class of finitely presented groups.³


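Proposition 13.2 Let a group $G$ be generated by elements $\{g_x\}_{x \in X}$ with relating
set $R \subset F_X$. Then for every group $H$ and every family of elements $\{h_x\}_{x \in X} \subset H$,
there exists at most one homomorphism $\psi : G \to H$ such that $\psi(g_x) = h_x$ for
all $x \in X$. It exists if and only if for every relator $x_1^{\varepsilon_1} x_2^{\varepsilon_2} \cdots x_m^{\varepsilon_m} \in R$, the equality
$h_{x_1}^{\varepsilon_1} h_{x_2}^{\varepsilon_2} \cdots h_{x_m}^{\varepsilon_m} = e$ holds in $H$. In this case, $\psi$ is well defined by

$$\psi\bigl(g_{x_1}^{\varepsilon_1} g_{x_2}^{\varepsilon_2} \cdots g_{x_m}^{\varepsilon_m}\bigr) = h_{x_1}^{\varepsilon_1} h_{x_2}^{\varepsilon_2} \cdots h_{x_m}^{\varepsilon_m}. \tag{13.2}$$

Proof The family $\{h_x\}_{x \in X} \subset H$ is the same as the map $\Phi : X \to H$, $x \mapsto h_x$.
By Proposition 13.1, such maps are in bijection with the group homomorphisms
$\mathrm{ev}_\Phi : F_X \to H$. Such a homomorphism is factorized through the quotient group
$G = F_X/N_R$ as $\mathrm{ev}_\Phi = \psi \circ \pi$, where $\pi : F_X \twoheadrightarrow F_X/N_R$ is the quotient map,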


²That is, the intersection of all normal subgroups containing $R$.
³This is a part of the famous undecidability of the word problem proved by Pyotr Novikov in 1955.
if and only if $N_R \subset \ker \mathrm{ev}_\Phi$. Since $N_R \triangleleft F_X$ is the smallest normal subgroup
containing $R$ and $\ker \mathrm{ev}_\Phi \triangleleft F_X$ is normal, the inclusion $N_R \subset \ker \mathrm{ev}_\Phi$ is
equivalent to the inclusion $R \subset \ker \mathrm{ev}_\Phi$. If it holds, then by Proposition 13.1, the
homomorphism $\psi$ has to be defined by formula (13.2). □


13.1.3 Presentations for the Dihedral Groups 


Let us show that the dihedral group $D_n$ can be presented by the generating set $X =
\{x_1, x_2\}$ and relating set $R = \{x_1^2, x_2^2, (x_1 x_2)^n\}$, that is, it can be generated by two
elements $\sigma_1, \sigma_2 \in D_n$ such that

$$\sigma_1^2 = \sigma_2^2 = (\sigma_1 \sigma_2)^n = e \tag{13.3}$$

and all the relations between $\sigma_1$ and $\sigma_2$ follow from (13.3). The reflection lines cut
the regular $n$-gon into $2n$ right triangles (see Fig. 13.1).


Fig. 13.1 Reflections 
generating the dihedral group 


Choose one of them and label it $e \in D_n$. Since every isometry of the plane is
uniquely determined by its action on this triangle, the $2n$ transformations $g \in D_n$
are in bijection with our $2n$ triangles. Let us label triangle $g(e)$ by $g$. Write
$\ell_1$, $\ell_2$ for the reflection lines that pass through the sides of triangle $e$, and let
$\sigma_1, \sigma_2 \in D_n$ be the reflections in these lines. Then the triangles obtained from $e$
by sequential counterclockwise reflections in the sides become labeled $\sigma_2$, $\sigma_2\sigma_1$,
$\sigma_2\sigma_1\sigma_2$, $\sigma_2\sigma_1\sigma_2\sigma_1$, $\ldots$ and the triangles obtained by clockwise reflections become
labeled by $\sigma_1$, $\sigma_1\sigma_2$, $\sigma_1\sigma_2\sigma_1$, $\sigma_1\sigma_2\sigma_1\sigma_2$, $\ldots$


Exercise 13.4 Let $\sigma_\ell : \mathbb{R}^2 \to \mathbb{R}^2$ be reflection in the line $\ell$. Check that for every
isometry $F : \mathbb{R}^2 \to \mathbb{R}^2$, we have $F \circ \sigma_\ell \circ F^{-1} = \sigma_{F(\ell)}$, or equivalently, $\sigma_{F(\ell)} \circ F =
F \circ \sigma_\ell$.


The composition $\sigma_1 \circ \sigma_2$ is a rotation in the direction from $\ell_2$ to $\ell_1$ by the doubled
angle between $\ell_2$ and $\ell_1$, which equals $2\pi/n$ for $D_n$. Therefore, $\sigma_1^2 = \sigma_2^2 =
(\sigma_1\sigma_2)^n = e$. By Proposition 13.2, the assignment $x_1 \mapsto \sigma_1$, $x_2 \mapsto \sigma_2$ gives a
well-defined epimorphism $\mathrm{ev}_{\sigma_1,\sigma_2} : F_2/N_R \twoheadrightarrow D_n$, where $N_R \triangleleft F_2$ is the smallest normal
subgroup containing $R = \{x_1^2, x_2^2, (x_1 x_2)^n\}$. It remains to verify that it is injective,
that is, that two words from $F_2$ sent to the same $g \in D_n$ are always congruent
modulo $R$. Each word constructed from the alphabet $\{x_1, x_2\}$ is congruent modulo $R$
to a word of length at most $2n - 1$ looking either like $x_1 x_2 x_1 \ldots$ or like $x_2 x_1 x_2 \ldots$.
We have seen above that all words of the first type are sent by $\mathrm{ev}_{\sigma_1,\sigma_2}$ to different
elements of $D_n$, labeling different triangles. The same is true for all words of the
second type. Two words of different types go to the same element of $D_n$ if and
only if the corresponding words $\sigma_1\sigma_2\sigma_1 \ldots$ and $\sigma_2\sigma_1\sigma_2 \ldots$ label the same triangle
$g$ in Fig. 13.1, that is, if they encode two different ways of rolling triangle $e$ into
$g$: either counterclockwise or clockwise. In this case, the coincidence is written as

$$\mathrm{ev}_{\sigma_1,\sigma_2}\bigl(\underbrace{x_1 x_2 x_1 \ldots}_{k}\bigr) = \mathrm{ev}_{\sigma_1,\sigma_2}\bigl(\underbrace{x_2 x_1 x_2 \ldots}_{2n-k}\bigr).$$

But the relation $\underbrace{x_1 x_2 x_1 \ldots}_{k} = \underbrace{x_2 x_1 x_2 \ldots}_{2n-k}$
certainly follows from the relations $x_1^2 = x_2^2 = (x_1 x_2)^n = e$.


Exercise 13.5 Check this. 
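
The relations and the coincidences of alternating words can also be tested numerically for small $n$. The Python sketch below is a sanity check under our own modelling assumptions, not a replacement for the exercise: we let $D_n$ act on the vertices $0, 1, \ldots, n-1$ of the $n$-gon (faithful for $n \geq 3$) and take the two reflections $i \mapsto -i$ and $i \mapsto 1 - i$ modulo $n$ as $\sigma_1$, $\sigma_2$. The function names are hypothetical.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def alternating_word(first, second, length):
    """Product first * second * first * ... with the given number of factors."""
    result = tuple(range(len(first)))
    for position in range(length):
        result = compose(result, first if position % 2 == 0 else second)
    return result

def check_dihedral(n):
    identity = tuple(range(n))
    s1 = tuple((-i) % n for i in range(n))       # reflection i -> -i
    s2 = tuple((1 - i) % n for i in range(n))    # reflection i -> 1 - i
    relations_hold = (
        compose(s1, s1) == identity
        and compose(s2, s2) == identity
        and power(compose(s1, s2), n) == identity
    )
    # Alternating words of both types should already give all 2n elements.
    elements = {alternating_word(s1, s2, k) for k in range(n + 1)}
    elements |= {alternating_word(s2, s1, k) for k in range(1, n)}
    # A k-letter word of the first type equals the (2n - k)-letter word of the second type.
    coincidences = all(
        alternating_word(s1, s2, k) == alternating_word(s2, s1, 2 * n - k)
        for k in range(1, 2 * n)
    )
    return relations_hold, len(elements), coincidences

for n in range(3, 8):
    print(n, check_dihedral(n))
```

The check works in a concrete model of $D_n$, so it confirms that the coincidences hold there; deducing them from the relators alone is the content of Exercise 13.5.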


13.1.4 Presentations of the Groups of Platonic Solids 


Let $\Phi$ be one of the Platonic solids with triangular faces, that is, the tetrahedron,
octahedron, or icosahedron. Let us put

$m_1$ = the number of faces meeting at each vertex of $\Phi$,

$m_2$ = the number of faces meeting at each edge of $\Phi$ = 2,

$m_3$ = 3.

The reflection planes of $\Phi$ produce the barycentric subdivision of each face into six
right triangles,⁴ $2m_1$ of which meet at the vertices of $\Phi$, $2m_2$ at the midpoints of the
edges of $\Phi$, and $2m_3$ at the centers of the faces of $\Phi$. The total number of triangles
equals $N = 6 \cdot (\text{number of faces})$. Let us collect these combinatorial data in a table:


Φ            | m_1 | m_2 | m_3 | N
tetrahedron  |  3  |  2  |  3  | 24
octahedron   |  4  |  2  |  3  | 48
icosahedron  |  5  |  2  |  3  | 120


Note that a regular $n$-gon could be included in this table with values $m_1 = 2$, $m_2 =
2$, $m_3 = n$, $N = 4n$ if we agree that it has two faces exchanged by reflection in the
plane of the $n$-gon.

The intersections of the mirror planes of $\Phi$ with its circumscribed sphere
triangulate this sphere by $N$ congruent triangles with angles $\pi/m_1$, $\pi/m_2$, $\pi/m_3$
equal to the angles between the reflection planes of $\Phi$. In the tetrahedral case, the
stereographic projection of this triangulation onto a plane can be seen in Fig. 13.2
on p. 315. Let us label one triangle of the triangulation with $e \in O_\Phi$ and write $\pi_1$,
$\pi_2$, $\pi_3$ for the reflection planes that cut it out. We number the planes in such a way
that the angle between $\pi_i$ and $\pi_j$ equals $\pi/m_k$. Since every isometry of $\mathbb{R}^3$ sending
$\Phi$ to itself preserves the center of $\Phi$, such an isometry is uniquely determined by
its action on the triangle $e$. Therefore, the transformations $g \in O_\Phi$ are in bijection
with the triangles⁵ of the triangulation. Let us label triangle $g(e)$ with the element
$g \in O_\Phi$. Then every transformation $h \in O_\Phi$ sends each triangle $g$ to the triangle
$hg$. Write $\sigma_i \in O_\Phi$, $i = 1, 2, 3$, for the reflections in the planes $\pi_i$. The composition
$\sigma_i \circ \sigma_j$ is a rotation about the line $\pi_i \cap \pi_j$ in the direction from $\pi_j$ to $\pi_i$ by the angle
$2\pi/m_k$, the doubled angle between $\pi_i$ and $\pi_j$. Therefore, the reflections $\sigma_i$ satisfy
the six relations

$$\sigma_i^2 = e \quad\text{and}\quad (\sigma_i \sigma_j)^{m_k} = e, \tag{13.4}$$

where $i = 1, 2, 3$ and $(i, j, k)$ runs through three cyclic permutations of $(1, 2, 3)$.
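
The triangle counts $N$ can be cross-checked against the angles. By Girard's theorem, a spherical triangle with angles $\pi/m_1$, $\pi/m_2$, $\pi/m_3$ on the unit sphere has area $\pi(1/m_1 + 1/m_2 + 1/m_3 - 1)$, and $N$ such triangles must fill the total area $4\pi$. The following Python sketch carries out this arithmetic exactly; the function name and the use of Girard's theorem are our additions, not part of the text's argument.

```python
from fractions import Fraction

def triangle_count(m1, m2, m3):
    """Number of congruent spherical triangles with angles pi/m1, pi/m2, pi/m3 tiling the sphere.

    The area of one triangle is pi * (1/m1 + 1/m2 + 1/m3 - 1); the sphere has area 4 * pi.
    """
    excess_over_pi = Fraction(1, m1) + Fraction(1, m2) + Fraction(1, m3) - 1
    return Fraction(4) / excess_over_pi

solids = {
    "tetrahedron": (3, 2, 3),
    "octahedron": (4, 2, 3),
    "icosahedron": (5, 2, 3),
}
counts = {name: triangle_count(*m) for name, m in solids.items()}
print(counts)
```

The same formula gives $4n$ for the values $(2, 2, n)$ attached to a regular $n$-gon.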


⁴With vertices at the barycenter of the face, at the barycenters of its edges, and at its vertices.


⁵Note that we get a new explanation for the identity $|O_\Phi| = N = 6 \cdot (\text{number of faces})$.
Fig. 13.2 Triangulation of the sphere by the reflection planes of the tetrahedron [0, 1, 2, 3]
(stereographic projection from the point opposite vertex [0] onto the equatorial plane parallel to
the face [1, 2, 3])


Proposition 13.3 The complete group $O_\Phi$ of a Platonic solid $\Phi$ with triangular
faces is presented by generators $\{x_1, x_2, x_3\}$ and relators

$$x_i^2 \quad\text{and}\quad (x_i x_j)^{m_k}. \tag{13.5}$$

Proof The previous discussion shows that the evaluation $x_i \mapsto \sigma_i$ produces a
well-defined homomorphism $\varphi : F_3/N \to O_\Phi$, where $N \triangleleft F_3$ is the smallest
normal subgroup containing the six words (13.5). It takes the class of the word
$w = x_{i_1} x_{i_2} \ldots x_{i_m}$, where $i_\nu \in \{1, 2, 3\}$, to

$$g = \varphi(w) = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_m} \in O_\Phi.$$

When we read the sequence of reflections $\sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_m}$ from left to right,⁶ the
triangles labeled by elements

$$g_\nu \stackrel{\text{def}}{=} \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_\nu} \tag{13.6}$$


⁶That is, in the opposite order to that in which the reflections are made.
form a continuous ribbon $g_0, g_1, \ldots, g_m$, which starts at $g_0 = e$ and ends at $g_m = g$.
Every triangle $g_\nu$ of this ribbon is obtained from the preceding triangle $g_{\nu-1}$ by
reflection in the plane $g_{\nu-1}(\pi_{i_\nu})$. Under the superposition of triangle $e$ with triangle
$g_{\nu-1}$ provided by the transformation $g_{\nu-1}$, this reflection plane matches the side of
$e$ that is cut out by $\pi_{i_\nu}$. Indeed, by Exercise 13.4 on p. 313, reflection in the plane
$g_{\nu-1}(\pi_{i_\nu})$ coincides with the composition $g_{\nu-1} \sigma_{i_\nu} g_{\nu-1}^{-1} \in O_\Phi$, which maps triangle
$g_{\nu-1}$ to triangle $g_{\nu-1} \sigma_{i_\nu} g_{\nu-1}^{-1} g_{\nu-1} = g_{\nu-1} \sigma_{i_\nu} = g_\nu$.

Therefore, if we label the sides of triangle $e$ by 1, 2, 3 in accordance with the
numbers of planes that cut them out of the sphere and then extend this marking
to all triangles in such a way that congruent sides receive identical labels, then the
reflection $g_{\nu-1} \mapsto g_\nu$ is made through the $i_\nu$th side, whose mark $i_\nu \in \{1, 2, 3\}$ equals
the $\nu$th index $i_\nu$ in the sequence (13.6).

Thus, to write a sequence of numbers $i_1, i_2, \ldots, i_m \in \{1, 2, 3\}$ producing a ribbon
of triangles (13.6) that goes from $e$ to any prescribed triangle $g$, we proceed as
follows. Connect some internal points of triangles $e$, $g$ by a smooth curve⁷ that does
not pass through the vertices of the triangulation and crosses all sides transversally.
Then go along this curve from $e$ to $g$ and write down the labels on the sequential
edges that we cross. We obtain thereby a sequence of labels $i_1, i_2, \ldots, i_m$ such that
$g = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_m} = \varphi(x_{i_1} x_{i_2} \cdots x_{i_m})$. Figure 13.3 shows how this works. Red, green,
and yellow are used there instead of labels 1, 2, 3 respectively. In particular, we see
that $\varphi : F_3/N \twoheadrightarrow O_\Phi$ is surjective. Let us check now that it is injective, i.e., any
two words $w_1$, $w_2$ from the alphabet $\{x_1, x_2, x_3\}$ mapped to the same transformation
$g \in O_\Phi$ are equivalent modulo relators (13.5). Each of these words produces a
ribbon of triangles beginning at $e$ and ending at $g$. In these ribbons, the $\nu$th triangle
is obtained from the preceding, $(\nu - 1)$th, triangle by reflection in the side marked
by the same index $i_\nu \in \{1, 2, 3\}$ as the $\nu$th letter $x_{i_\nu}$ in the corresponding word $w_1$ or
$w_2$. Let us draw a smooth curve as above within each of the two ribbons and deform
the second curve into the first across the surface of the sphere in such a way that at
each moment, the deformed curve remains transversal to all the sides it crosses.
Each time the curve meets some vertex $v$ common to $2m_k$ triangles, the ribbon
of triangles that corresponds to the curve is changed as follows. A sequence of $\ell$
sequential triangles covering a part of the regular $m_k$-gon centered at $v$ is replaced
by the complementary sequence of $2m_k - \ell$ triangles covering the remaining part
of the $m_k$-gon and proceeding around $v$ in the opposite direction. On the side of
the words, this corresponds to replacement of some $\ell$-letter fragment looking like
$x_i x_j x_i x_j x_i \ldots$ by a $(2m_k - \ell)$-letter fragment looking like $x_j x_i x_j x_i x_j \ldots$. Since these
two fragments are equal modulo relations $x_i^2 = x_j^2 = (x_i x_j)^{m_k} = e$, the class of the
word in $F_3/N$ remains unchanged. □


⁷For instance, by a geodesic cut out of the sphere by a plane passing through the center of the
sphere.


Fig. 13.3) 122%3X2X 3X |X3X | XQX3XoN |X3X XQ = QA HN AZNQM |X ZXA3ZXQN3X |X3X2 


For example, the upper and lower trajectories leading from e to g in Fig. 13.3 
produce the words 


NM XIXZXIN ZN |X 3ZN | XINZNIN | X3HK 1X2 
and 
XIN 1 X3ZXIWN {NX 3XINZXIN 3X 1 X3X2 
transformed into each other by means of the cyclic relations 
$x_1 x_2 = x_2 x_1$, $x_3 x_1 x_3 x_1 = x_1 x_3$, $x_3 x_1 x_3 = x_1 x_3 x_1$


applied at labeled vertices. 


Exercise 13.6 Choose an internal point $a$ in triangle $e$ and some point $b$ not opposite
to $a$ within triangle $g$ such that the shortest of the two geodesics⁸ joining $a$ and $b$
does not pass through the vertices of the triangulation.⁹ Use this geodesic to write a
word $w \in F_3$ such that $\varphi(w) = g$ as above. Show that the length of this word does
not depend on the agreed-upon choice of points and that $g$ cannot be represented by
a shorter word.


⁸That is, arcs of a great circle cut out of the sphere by the plane passing through $a$, $b$, and the center
of the sphere.

⁹This always can be achieved by a small perturbation of $b$, because there are finitely many
geodesics passing through $a$ and some vertex of the triangulation.
13.2 Presentation of the Symmetric Group


Consider the symmetric group $S_{n+1} = \operatorname{Aut}\{0, 1, \ldots, n\}$ and write $\sigma_i = |(i-1), i\rangle$
for the transposition of sequential elements $(i-1)$, $i$, where $1 \leq i \leq n$.


Exercise 13.7 Check that transpositions $\sigma_i$ generate $S_{n+1}$ and satisfy the relations

$$\sigma_i^2 = e, \quad \sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1}, \quad \sigma_i \sigma_j = \sigma_j \sigma_i \ \text{ for } |i - j| \geq 2.$$
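
For small $n$ the statement of Exercise 13.7 can be confirmed by brute force. The following Python sketch is only a numerical check under our own encoding (a permutation of $\{0, 1, \ldots, n\}$ is stored as the tuple of its images, and the helper names are hypothetical); it does not replace the proof asked for in the exercise.

```python
from math import factorial

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def adjacent_transposition(i, size):
    """sigma_i swaps the elements i-1 and i of {0, ..., size-1}."""
    images = list(range(size))
    images[i - 1], images[i] = images[i], images[i - 1]
    return tuple(images)

def check_relations(n):
    size = n + 1
    e = tuple(range(size))
    s = {i: adjacent_transposition(i, size) for i in range(1, n + 1)}
    involutions = all(compose(s[i], s[i]) == e for i in s)
    braid = all(
        compose(s[i], compose(s[i + 1], s[i])) == compose(s[i + 1], compose(s[i], s[i + 1]))
        for i in range(1, n)
    )
    commuting = all(
        compose(s[i], s[j]) == compose(s[j], s[i])
        for i in s for j in s if abs(i - j) >= 2
    )
    return involutions and braid and commuting

def generated_size(n):
    """Order of the subgroup of S_{n+1} generated by sigma_1, ..., sigma_n."""
    size = n + 1
    identity = tuple(range(size))
    generators = [adjacent_transposition(i, size) for i in range(1, n + 1)]
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return len(seen)

for n in range(1, 6):
    print(n, check_relations(n), generated_size(n), factorial(n + 1))
```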


Thus, we have a surjective group homomorphism

$$\varphi : F_n/N \twoheadrightarrow S_{n+1}, \quad x_i \mapsto \sigma_i, \tag{13.7}$$

where $N \triangleleft F_n$ is the smallest normal subgroup containing the words

$$x_i^2, \quad (x_i x_{i+1})^3 \quad\text{and}\quad (x_i x_j)^2 \ \text{ for } |i - j| \geq 2. \tag{13.8}$$

Below we will give two proofs that $\varphi$ is injective. The geometric approach
from Sect. 13.2.1 will repeat the arguments used in the proof of Proposition 13.3
for the tetrahedral case, but instead of the tetrahedron, the standard $n$-simplex in
$\mathbb{R}^{n+1}$ will be considered. In Sect. 13.2.2, geometric arguments will be translated
into combinatorial language for the sake of those who prefer a pedantic transfer of
letters from one side of an equation to the other rather than imagining $n$-dimensional
pictures.


13.2.1 Complete Group of a Regular Simplex 


Write $\mathbb{A}^n$ for $\mathbb{R}^n$ sitting within $\mathbb{R}^{n+1}$ as an affine hyperplane drawn through the heads
of all standard basis vectors¹⁰ $e_0, e_1, \ldots, e_n \in \mathbb{R}^{n+1}$. The convex hull of the points
$e_0, e_1, \ldots, e_n \in \mathbb{A}^n$ is called the standard $n$-simplex¹¹ and denoted by $\Delta$. This is
a regular polyhedron with center $c = ((n+1)^{-1}, (n+1)^{-1}, \ldots, (n+1)^{-1})$. We
vectorize the affine space $\mathbb{A}^n$ using $c$ as the origin. To prevent many-storied indices,
we denote the vertices of $\Delta$ just by numbers $i \in \{0, 1, \ldots, n\}$. For $n = 3$, the
simplex $\Delta$ is the regular tetrahedron $[0, 1, 2, 3] \subset \mathbb{R}^3$ considered in Sect. 13.1.4.


Exercise 13.8 Given two ordered collections of $(n+1)$ points in $\mathbb{A}^n$ such that
neither collection is contained in a hyperplane, prove that there exists a unique


¹⁰Equivalently, the hyperplane $\mathbb{A}^n \subset \mathbb{R}^{n+1}$ is given by the equation $x_0 + x_1 + \cdots + x_n = 1$.
¹¹See Problem 10.9 on p. 249.
affine map¹² $\mathbb{A}^n \to \mathbb{A}^n$ sending one collection to the other. Show that this map
is automatically invertible.


The exercise implies that elements of the complete group $O_\Delta$ are in a natural
bijection with the permutations of vertices $[0, 1, \ldots, n]$. If $g \in S_{n+1}$ is such a
permutation, we use the same letter $g$ to denote the linear isometry $\Delta \xrightarrow{\sim} \Delta$ sending
vertex $i$ to vertex $g_i = g(i)$ for all $i$.

The simplex $\Delta$ has $n(n+1)/2$ reflection hyperplanes $\pi_{ij}$ passing through the
midpoint of edge $[i, j]$ and the opposite face of codimension 2, the convex hull of
vertices $\{0, 1, \ldots, n\} \smallsetminus \{i, j\}$. The transposition $\sigma_{ij}$ of vertices $i$, $j$ is exactly the
reflection in the hyperplane $\pi_{ij}$.


Exercise 13.9 Check that the hyperplanes $\pi_{ij}$ and $\pi_{km}$ are orthogonal for $\{i, j\} \cap
\{k, m\} = \varnothing$ and $\measuredangle(\pi_{ij}, \pi_{jk}) = \pi/3$ for $i \neq k$.


The hyperplanes $\pi_{ij}$ decompose the simplex $\Delta$ into $(n+1)!$ simplices forming the
barycentric subdivision of $\Delta$. All these simplices have a common vertex at the
center of $\Delta$. Their other vertices fall into the barycenters of the faces of $\Delta$. The
faces of $\Delta$ are numbered by subsets in $\{0, 1, \ldots, n\}$. Let $[i_0, i_1, \ldots, i_m] \subset \Delta$ be
the $m$-dimensional face formed by the convex hull of vertices $\{i_0, i_1, \ldots, i_m\}$. We
denote its barycenter by $c_{\{i_0, i_1, \ldots, i_m\}}$. In particular, one-point subsets $\{i\}$ correspond
to the vertices $c_{\{i\}} = i$ of $\Delta$, whereas the whole set $\{0, 1, \ldots, n\}$ gives the
center $c_{\{0, 1, \ldots, n\}} = c$ of $\Delta$ itself. Associated with every permutation $g =
(g_0, g_1, \ldots, g_n) \in S_{n+1}$ is the $n$-simplex $\Delta_g \subset \Delta$ with vertices at the barycenters
$c_{\{g_0\}}, c_{\{g_0, g_1\}}, c_{\{g_0, g_1, g_2\}}, \ldots, c_{\{g_0, g_1, \ldots, g_n\}}$, the first of which is vertex $g_0$ of $\Delta$, the
second is the midpoint of the edge $[g_0, g_1]$ outgoing from $g_0$, the third is the center
of the triangular face $[g_0, g_1, g_2]$ attached to the edge $[g_0, g_1]$, etc. All simplices
$\Delta_g$ are distinct, and every transformation $g \in O_\Delta$ maps each $\Delta_h$ to $\Delta_{gh}$. Thus, all
simplices of the barycentric subdivision of $\Delta$ are bijectively marked by permutations
$g \in S_{n+1}$, or equivalently, by orthogonal transformations $g : \Delta \xrightarrow{\sim} \Delta$ superposing
the beginning simplex $\Delta_e = [c_{\{0\}}, c_{\{0,1\}}, c_{\{0,1,2\}}, \ldots, c_{\{0,1,\ldots,n-1\}}, c_{\{0,1,\ldots,n\}}]$ with
simplices $\Delta_g$. For each $g$, we project the $(n-1)$-dimensional face of $\Delta_g$ opposite
vertex $c$ from $c$ onto the sphere $S^{n-1} \subset \mathbb{A}^n$ circumscribed about $\Delta$. Let us label
the resulting spherical $(n-1)$-simplices by $g$. We get a triangulation of the
circumscribed sphere by $(n+1)!$ congruent spherical simplices marked by transformations
$g \in O_\Delta$ in such a way that the transformation $g$ maps the spherical simplex $h$ to the
spherical simplex $gh$ exactly as took place in Sect. 13.1.4.

For $n = 3$, which corresponds to the group $S_4$, we get the triangulation of the
sphere $S^2$ by 24 congruent triangles with angles of $\pi/3$, $\pi/3$, $\pi/2$ shown in Fig. 13.2
on p. 315. In higher dimensions, the picture is completely similar.

The beginning simplex $e$ is cut out of the sphere by $n$ hyperplanes $\pi_i \stackrel{\text{def}}{=} \pi_{i-1,i}$, for
$1 \leq i \leq n$. By Exercise 13.9, they intersect at dihedral angles $\measuredangle(\pi_i, \pi_{i+1}) = 60^\circ$
and $\measuredangle(\pi_i, \pi_j) = 90^\circ$ for $|i - j| \geq 2$. Write $\sigma_i$ for the reflection in $\pi_i$ (note that
this agrees with notation from Sect. 13.2). The compositions $\sigma_i \sigma_{i+1}$ and $\sigma_i \sigma_j$, where
$|i - j| \geq 2$, act identically on the $(n-2)$-dimensional intersections $\pi_i \cap \pi_{i+1}$,
$\pi_i \cap \pi_j$. In the perpendicular 2-planes, they act as rotations by angles $120^\circ$ and
$180^\circ$ respectively. Therefore, $\sigma_1, \sigma_2, \ldots, \sigma_n$ satisfy the relations $\sigma_i^2 = (\sigma_i \sigma_j)^2 =
(\sigma_i \sigma_{i+1})^3 = e$ for all $1 \leq i, j \leq n$ such that $|i - j| \geq 2$.


¹²See Sect. 6.5.5 on p. 148.

We conclude that there is a well-defined homomorphism

$$\varphi : F_n/N \to O_\Delta, \quad x_i \mapsto \sigma_i,$$

where $N \triangleleft F_n$ is the smallest normal subgroup containing the words (13.8). The
word $w = x_{i_1} x_{i_2} \ldots x_{i_m} \in F_n$ produces a chain $g_0, g_1, \ldots, g_m$ of spherical simplices

$$g_\nu \stackrel{\text{def}}{=} \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_\nu}$$

with $g_0 = e$ and $g_m = \varphi(w)$. In this chain, each simplex $g_\nu$ is the reflection of
the preceding simplex $g_{\nu-1}$ in the face $g_{\nu-1}(\pi_{i_\nu})$, which matches the $i_\nu$th face¹³ of
the beginning simplex $e$ under the superposition of $e$ with $g_{\nu-1}$ provided by the
transformation $g_{\nu-1} \in O_\Delta$. To write the word $w$ such that $\varphi(w) = g$, we join
some internal points of simplices $e$, $g$ by a smooth curve transversally crossing all
codimension-1 faces that it meets at interior points of these faces. Then we walk
from $e$ to $g$ along the curve and write the numbers $i_1, i_2, \ldots, i_m$ of the sequential
codimension-1 faces we cross. Clearly, $g = \varphi(x_{i_1} x_{i_2} \ldots x_{i_m})$.

Any two words in $F_n$ representing the same element $g \in O_\Delta$ are obtained in the way just described from some curves going from $e$ to $g$. We can deform one of these curves into the other within $S^{n-1}$ in such a way that all the time, it remains transversal to all codimension-1 walls that it meets. Consider a moment when the curve is moved over the wall crossing the locus of codimension 2. Such a locus is congruent to some intersection $\pi_i \cap \pi_j \cap S^{n-1}$. If we project $S^{n-1}$ along $\pi_i \cap \pi_j$ onto the 2-dimensional Euclidean plane perpendicular to $\pi_i \cap \pi_j$, then in a neighborhood of the point to which $\pi_i \cap \pi_j$ is projected we see a picture like that in Fig. 13.3 on p. 317. Therefore, if $\angle(\pi_i, \pi_j) = 90^\circ$, then the passage through $\pi_i \cap \pi_j \cap S^{n-1}$ replaces some fragment $x_i x_j$ by the reversed fragment $x_j x_i$. Two words obtained from each other by such a replacement are equivalent in $F_n/N$, because of the relation $(x_i x_j)^2 = e$. If $\angle(\pi_i, \pi_j) = 60^\circ$, then $j = i \pm 1$, and all possible mutations of the word agree with the relation $(x_i x_{i+1})^3 = e$. We conclude that two words mapped by $\varphi$ to the same $g$ are always equivalent modulo the relations (13.8), i.e.,

$$\varphi : F_n/N \xrightarrow{\sim} S_{n+1}$$

is an isomorphism.


¹³ Which is cut out of the sphere by the reflection plane $\pi_{i_\nu}$.

Exercise 13.10 Choose some internal points $a$, $b$ in the spherical simplices $e$, $g$ such that $b$ is not opposite to $a$ and the short geodesic¹⁴ joining $a$ and $b$ crosses walls of codimension 1 at internal points of the faces of the spherical simplices. Use this geodesic to write a word $w \in F_n$ such that $\varphi(w) = g$. Show that the length of this word does not depend on the agreed-upon choice of $a$, $b$ and that $g$ cannot be represented by a shorter word.


13.2.2. Bruhat Order 


Recall¹⁵ that $I(g)$ for $g \in S_{n+1}$ means the inversion number of $g$, that is, the total number of ordered pairs $(i, j)$ such that $0 \leq i < j \leq n$ and $g(i) > g(j)$. We are going to show that $I(g)$ is the minimal length of the expression $g = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_m}$, where $i_\nu \in \{1, 2, \ldots, n\}$ and $\sigma_i$ denotes the transposition of sequential elements $(i - 1)$, $i$ as before. For this reason, $I(g)$ is also called the length of the permutation $g$.


Exercise 13.11 Verify that $0 \leq I(g) \leq n(n+1)/2$ in $S_{n+1}$ and the only permutations with inversion numbers $0$ and $n(n+1)/2$ are respectively

$$\mathrm{Id} = (0, 1, \ldots, n) \quad \text{and} \quad \delta \stackrel{\text{def}}{=} (n, (n-1), \ldots, 1, 0).$$


For every $g \in S_{n+1}$ and $i = 1, 2, \ldots, n$, the permutation $g\sigma_i$ is constructed from $g$ by swapping the $(i-1)$th symbol $g(i-1)$ with the $i$th symbol $g(i)$:

$$(g(0), \ldots, g(i-2), g(i-1), g(i), g(i+1), \ldots, g(n)) \circ \sigma_i = (g(0), \ldots, g(i-2), g(i), g(i-1), g(i+1), \ldots, g(n)).$$

Thus, $I(g\sigma_i) = I(g) + 1$ for $g(i-1) < g(i)$ and $I(g\sigma_i) = I(g) - 1$ for $g(i-1) > g(i)$. Since right multiplication by $\sigma_i$ can increase $I(g)$ by at most 1, every decomposition $g = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_m}$ has length $m \geq I(g)$.
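The rule $I(g\sigma_i) = I(g) \pm 1$ can be checked by brute force. The following Python sketch does this over all of $S_4$; the names `inversions` and `times_sigma` are illustrative choices rather than notation from the text, and a permutation of $\{0, \ldots, n\}$ is stored as the tuple $(g(0), \ldots, g(n))$.

```python
from itertools import permutations

def inversions(g):
    """Inversion number I(g): count pairs i < j with g(i) > g(j)."""
    return sum(1 for i in range(len(g))
                 for j in range(i + 1, len(g)) if g[i] > g[j])

def times_sigma(g, i):
    """Right multiplication g * sigma_i: swap the entries in positions i-1 and i."""
    h = list(g)
    h[i - 1], h[i] = h[i], h[i - 1]
    return tuple(h)

# Check I(g sigma_i) = I(g) + 1 or I(g) - 1 for every g in S_4 (n = 3).
n = 3
for g in permutations(range(n + 1)):
    for i in range(1, n + 1):
        step = 1 if g[i - 1] < g[i] else -1
        assert inversions(times_sigma(g, i)) == inversions(g) + step
```

The loop raises no assertion error, which agrees with the claim for $n = 3$; it is a finite check, not a proof.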


Exercise 13.12 Check that if $I(h) > 0$, then there exist sequential elements $i - 1$, $i$ such that $h(i-1) > h(i)$, and for this $i$, the equality $I(h\sigma_i) = I(h) - 1$ holds.


The exercise implies that there exists a chain of right multiplications by appropriate $\sigma_i$ that kill some inversion pair of sequential elements in each step. Such a chain is forced to reduce $g$ down to $e$ in exactly $I(g)$ steps:

$$g \mapsto g\sigma_{i_1} \mapsto g\sigma_{i_1}\sigma_{i_2} \mapsto \cdots \mapsto g\sigma_{i_1}\sigma_{i_2} \cdots \sigma_{i_{I(g)}} = e.$$


¹⁴ That is, the shorter of the two arcs of the great circle cut out of $S^{n-1} \subset \mathbb{R}^n$ by the 2-dimensional plane passing through $a$, $b$, and the center of the sphere.

¹⁵ See Sect. 9.2 on p. 208.




Therefore, $g = \sigma_{i_{I(g)}} \sigma_{i_{I(g)-1}} \cdots \sigma_{i_1}$. This proves once more the surjectivity of the homomorphism (13.7) and shows that $I(g)$ coincides with the minimal length of words mapped to $g$ by (13.7).
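The reduction of $g$ to $e$ described above is essentially a bubble sort, and it can be sketched in Python as follows. The helper names `minimal_word` and `evaluate` are hypothetical, and the choice of always killing the leftmost available inversion is just one of several valid strategies.

```python
from itertools import permutations

def inversions(g):
    """Inversion number I(g) of the tuple (g(0), ..., g(n))."""
    return sum(1 for i in range(len(g))
                 for j in range(i + 1, len(g)) if g[i] > g[j])

def minimal_word(g):
    """Indices [j_1, ..., j_m] with g = sigma_{j_1} ... sigma_{j_m} and m = I(g)."""
    h = list(g)
    killed = []                            # i_1, i_2, ... used on the way from g to e
    while True:
        descents = [i for i in range(1, len(h)) if h[i - 1] > h[i]]
        if not descents:
            break
        i = descents[0]
        h[i - 1], h[i] = h[i], h[i - 1]    # g -> g * sigma_i kills one inversion
        killed.append(i)
    return killed[::-1]                    # g = sigma_{i_m} ... sigma_{i_1}

def evaluate(word, n):
    """The product sigma_{j_1} ... sigma_{j_m} in S_{n+1} as a tuple."""
    h = list(range(n + 1))
    for i in word:                         # successive right multiplications
        h[i - 1], h[i] = h[i], h[i - 1]
    return tuple(h)

# Every g in S_4 gets a word of length I(g) that evaluates back to g.
for g in permutations(range(4)):
    w = minimal_word(g)
    assert len(w) == inversions(g) and evaluate(w, 3) == g
```

For instance, for $g = (2, 0, 1) \in S_3$ the sketch returns the word $x_2 x_1$, of length $I(g) = 2$.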

Every word $w \in F_n$ of length $I(g)$ such that $\varphi(w) = g$ is called a minimal word for $g$. The decomposition $g = \varphi(w) = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_{I(g)}}$ obtained from (any) minimal word $w = x_{i_1} x_{i_2} \cdots x_{i_{I(g)}}$ for $g$ is called a shortest decomposition of $g$. Note that neither is unique in general.

Let us write $g < h$ if $h = g\sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_k}$, where $k > 0$, and

$$I(g\sigma_{i_1}\sigma_{i_2} \cdots \sigma_{i_\nu}) = I(g\sigma_{i_1}\sigma_{i_2} \cdots \sigma_{i_{\nu-1}}) + 1$$

for all $\nu = 1, 2, \ldots, k$, i.e., each multiplication by the next $\sigma_{i_\nu}$ swaps some noninverse pair of sequential elements. The binary relation $g \leq h$, meaning that either $g = h$ or $g < h$, provides $S_{n+1}$ with a partial order¹⁶ called the Bruhat order.


Exercise 13.13 Convince yourself that the binary relation $g \leq h$ is reflexive, skew-symmetric, and transitive.


For every shortest decomposition $g = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_{I(g)}}$, the initial segments

$$g_\nu = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_\nu}$$

form a strictly increasing sequence with respect to the Bruhat order. It begins with $g_0 = e$, ends with $g_{I(g)} = g$, and each $g_\nu$ is obtained from $g_{\nu-1}$ by the transposition of some noninverse pair of sequential elements $g_{\nu-1}(i_\nu - 1) < g_{\nu-1}(i_\nu)$. The injectivity of $\varphi : F_n/N \to S_{n+1}$ follows from the next claim.


Proposition 13.4 Every word $w \in F_n$ is equivalent to some minimal word for the permutation $\varphi(w) \in S_{n+1}$ modulo the relations $x_i^2 = e$, $x_i x_{i+1} x_i = x_{i+1} x_i x_{i+1}$, and $x_i x_j = x_j x_i$ for $|i - j| \geq 2$. Any two minimal words for $\varphi(w)$ are also equivalent.
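That the transpositions $\sigma_i \in S_{n+1}$ themselves satisfy the three families of relations named in the proposition can be confirmed mechanically. The Python sketch below does so for small $n$; the helper names `sigma`, `compose`, and `relations_hold` are illustrative, not taken from the text.

```python
def sigma(i, n):
    """Transposition of i-1 and i in S_{n+1}, as a tuple acting on {0, ..., n}."""
    s = list(range(n + 1))
    s[i - 1], s[i] = s[i], s[i - 1]
    return tuple(s)

def compose(a, b):
    """Composition a∘b of permutations given as tuples (first b, then a)."""
    return tuple(a[b[k]] for k in range(len(a)))

def relations_hold(n):
    """Check sigma_i^2 = e, the braid relation, and far commutativity in S_{n+1}."""
    e = tuple(range(n + 1))
    s = {i: sigma(i, n) for i in range(1, n + 1)}
    for i in s:
        if compose(s[i], s[i]) != e:
            return False
        if i + 1 in s:
            lhs = compose(compose(s[i], s[i + 1]), s[i])
            rhs = compose(compose(s[i + 1], s[i]), s[i + 1])
            if lhs != rhs:
                return False
        for j in s:
            if abs(i - j) >= 2 and compose(s[i], s[j]) != compose(s[j], s[i]):
                return False
    return True

assert all(relations_hold(n) for n in range(1, 7))
```

This only verifies that $\varphi$ is well defined on $F_n/N$; the injectivity is the actual content of the proposition.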


Proof Write $\ell(w)$ for the length of the word $w$ and use induction on $\ell(w)$. For $\ell(w) = 0$, i.e., $w = \varnothing$, the statement is trivial. Assume that it has been proved for all words $w$ of length $\ell(w) \leq m$. It is enough to verify the statement for every word $wx_\nu$ such that $\ell(w) \leq m$. Let $g = \varphi(w)$. Then $\varphi(wx_\nu) = g\sigma_\nu$. If $w$ is not minimal for $g$, then by the induction hypothesis, $w$ is equivalent to some shorter word. Therefore, the word $wx_\nu$ is also equivalent to some shorter word $u$. By the induction hypothesis, $u$ is equivalent to some minimal word for $\varphi(u) = g\sigma_\nu$, and all minimal words for $g\sigma_\nu$ are equivalent to each other. Thus, the statement is true in this case.

Now assume that $w$ is minimal for $g$. There are two possibilities: either $g(\nu - 1) > g(\nu)$ or $g(\nu - 1) < g(\nu)$. In the first case, $g$ has a minimal word of the form $ux_\nu$, which is equivalent to $w$ by the induction hypothesis. Then $wx_\nu \sim ux_\nu x_\nu \sim u$, where $\sim$ means equivalence of words modulo $N \triangleleft F_n$. Therefore, the permutation $\varphi(wx_\nu) = \varphi(u)$ is represented by a word $u$ equivalent to $wx_\nu$ that is strictly shorter than $wx_\nu$. Applying the induction hypothesis to $u$, we conclude that $u$ is equivalent to some minimal word for $\varphi(wx_\nu)$, all minimal words for $\varphi(wx_\nu)$ are equivalent to each other, and the statement holds.

¹⁶ See Sect. 1.4 on p. 13.

Now assume that $g(\nu - 1) < g(\nu)$. In this case, $I(g\sigma_\nu) = I(g) + 1$, and $wx_\nu$ is a minimal word for $\varphi(wx_\nu)$. Thus, we have to show that every other minimal word $w'$ for $\varphi(wx_\nu)$ is equivalent to $wx_\nu$. Let us consider sequentially three alternative possibilities for the rightmost letter of $w'$: either it is $x_\nu$, or it is $x_{\nu \pm 1}$, or it is $x_\mu$, where $|\mu - \nu| \geq 2$. In the first case, $w' = ux_\nu$, where $u$ is a minimal word for $g$. Since $w$ is a minimal word for $g$ as well and $\ell(w) = \ell(u) \leq m$, the induction hypothesis implies that $u \sim w$. Therefore, $w' = ux_\nu \sim wx_\nu$ as well.

Now consider the second case, assuming that $w' = ux_{\nu+1}$ (for $w' = ux_{\nu-1}$ the argument is completely symmetric). Since both words $wx_\nu$, $ux_{\nu+1}$ are minimal for $g\sigma_\nu$, the permutation $g\sigma_\nu$ sends the triple of sequential elements $\nu - 1$, $\nu$, $\nu + 1$ to

$$g(\nu) > g(\nu - 1) > g(\nu + 1),$$

whereas the permutation $g$ sends them to

$$g(\nu - 1) < g(\nu) > g(\nu + 1).$$

Therefore, $g\sigma_\nu$ has a minimal word of the form $sx_{\nu+1}x_\nu x_{\nu+1}$, and $g$ has a minimal word of the form $tx_\nu x_{\nu+1}$ (because of $g(\nu - 1) > g(\nu + 1)$). The permutation $h = \varphi(s) = \varphi(t)$ sends the elements $\nu - 1$, $\nu$, $\nu + 1$ to

$$g(\nu + 1) < g(\nu - 1) < g(\nu)$$

and has $I(h) = I(g\sigma_\nu) - 3 = I(g) - 2$. Thus both words $t$, $s$ are minimal for $h$ and are equivalent by the induction hypothesis. At the same time, $w \sim tx_\nu x_{\nu+1}$. Hence, $wx_\nu \sim tx_\nu x_{\nu+1} x_\nu \sim sx_\nu x_{\nu+1} x_\nu \sim sx_{\nu+1} x_\nu x_{\nu+1}$. On the other hand, $sx_{\nu+1}x_\nu \sim u$, because both words are minimal for the same permutation, which sends $\nu - 1$, $\nu$, $\nu + 1$ to

$$g(\nu) > g(\nu + 1) < g(\nu - 1)$$

(where $g(\nu) > g(\nu - 1)$) and has inversion index $I(g\sigma_\nu) - 1$. We conclude that $wx_\nu \sim ux_{\nu+1}$, as required.

Finally, let $g\sigma_\nu = \varphi(wx_\nu) = \varphi(ux_\mu)$, where $|\mu - \nu| \geq 2$. Then two disjoint pairs of sequential elements $\nu - 1$, $\nu$ and $\mu - 1$, $\mu$ are sent by $g\sigma_\nu$ to $g(\nu - 1) > g(\nu)$ and $g(\mu - 1) > g(\mu)$. Therefore, $g\sigma_\nu$ has minimal words looking like $tx_\mu x_\nu$ and $sx_\nu x_\mu$, where both $t$, $s$ are minimal for the permutation $h = \varphi(t) = \varphi(s)$, which sends the pairs $\nu - 1$, $\nu$ and $\mu - 1$, $\mu$ to $g(\nu) < g(\nu - 1)$ and $g(\mu) < g(\mu - 1)$ respectively. By the induction hypothesis, $t \sim s$, because $I(h) = I(g\sigma_\nu) - 2 = m - 1$, and $w \sim tx_\mu$, because both words are minimal for $g$. A similar argument shows that $sx_\nu \sim u$. (Both words $sx_\nu$, $u$ are minimal for the permutation $\varphi(sx_\nu) = \varphi(u)$, which differs from $h$ by a transposition in the first of the two pairs and has inversion number $I(g\sigma_\nu) - 1 = m$.) We conclude that $wx_\nu \sim tx_\mu x_\nu \sim sx_\mu x_\nu \sim sx_\nu x_\mu \sim ux_\mu$, as required. □


13.3 Simple Groups and Composition Series 


A group G is called simple if G has no normal subgroups different from {e} and G. 
For example, every finite group of prime order is simple, because its only subgroups 
are the trivial group of one element and the entire group by Lagrange’s theorem, 
Theorem 12.3. Since normal subgroups are exactly the kernels of homomorphisms, 
a group $G$ is simple if and only if every group homomorphism $G \to G'$ is either
injective or trivial, i.e., takes the whole group $G$ to the unit of $G'$.


13.3.1 Jordan–Hölder Series


A finite strictly decreasing sequence of subgroups

$$G = G_0 \supsetneq G_1 \supsetneq G_2 \supsetneq \cdots \supsetneq G_{n-1} \supsetneq G_n = \{e\}$$

is called a Jordan–Hölder (or composition) series of the group $G$ if for each $i$, the subgroup $G_{i+1}$ is normal within $G_i$ and the quotient group $G_i/G_{i+1}$ is simple. In this case, the quotient groups $G_i/G_{i+1}$, $0 \leq i \leq n - 1$, are called Jordan–Hölder (or composition) factors of $G$. The cardinality $n$ of the total collection of all Jordan–Hölder factors (where repeated elements are allowed as well) is called the composition length¹⁷ of the group $G$. A group allowing a finite composition series is called a finite-length group.


Example 13.1 (Composition Factors of $S_4$) We have seen above that $S_4$ admits the following composition series:

$$S_4 \triangleright A_4 \triangleright V_4 \triangleright \mathbb{Z}/(2) \triangleright \{e\},$$

where $A_4 \triangleleft S_4$ is the subgroup of even permutations, $V_4 \triangleleft A_4$ is the Klein four group,¹⁸ and $\mathbb{Z}/(2) \triangleleft V_4 \simeq \mathbb{Z}/(2) \oplus \mathbb{Z}/(2)$ is any of the three cyclic subgroups of order 2 spanned by nonunit elements. Therefore, $S_4$ has four Jordan–Hölder factors: $\mathbb{Z}/(2) = S_4/A_4$, $\mathbb{Z}/(3) = A_4/V_4$, $\mathbb{Z}/(2) = V_4/(\mathbb{Z}/(2))$, and $\mathbb{Z}/(2) = \mathbb{Z}/(2)/\{e\}$. Thus, the composition length of $S_4$ is 4. Note that three of the four Jordan–Hölder factors of $S_4$ coincide.
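The series of Example 13.1 is small enough to verify by direct enumeration. The Python sketch below builds the four subgroups as sets of tuples on $\{0, 1, 2, 3\}$ and checks normality at each step together with the orders of the factors; all helper names, and the particular choice of the order-2 subgroup `C2`, are assumptions made for illustration.

```python
from itertools import permutations

def compose(a, b):
    """Composition a∘b of permutations given as tuples."""
    return tuple(a[b[k]] for k in range(len(a)))

def inverse(a):
    inv = [0] * len(a)
    for k, v in enumerate(a):
        inv[v] = k
    return tuple(inv)

def is_even(a):
    inv = sum(1 for i in range(len(a))
                for j in range(i + 1, len(a)) if a[i] > a[j])
    return inv % 2 == 0

def is_normal(sub, grp):
    """True if g h g^{-1} stays in sub for all g in grp, h in sub."""
    return all(compose(compose(g, h), inverse(g)) in sub
               for g in grp for h in sub)

e = (0, 1, 2, 3)
S4 = set(permutations(range(4)))
A4 = {g for g in S4 if is_even(g)}
V4 = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}   # identity and double transpositions
C2 = {e, (1, 0, 3, 2)}                               # one of three choices

chain = [S4, A4, V4, C2, {e}]
factor_orders = [len(big) // len(small) for big, small in zip(chain, chain[1:])]
normal_steps = [is_normal(small, big) for big, small in zip(chain, chain[1:])]
```

Here `factor_orders` comes out as `[2, 3, 2, 2]`, and since every factor has prime order, each is simple. Note that `C2` is normal in `V4` but not in `A4`, which is why normality is required only between neighboring terms.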


¹⁷ Or just length for short.
¹⁸ Consisting of the identity and three pairs of disjoint transpositions, of cyclic type $(2, 2)$.

Exercise 13.14 Check that $A_4/V_4 \simeq \mathbb{Z}/(3)$.


Theorem 13.1 (Jordan–Hölder Theorem) If a group $G$ admits a finite composition series, then the nonordered total collection of all Jordan–Hölder factors does not depend on the choice of composition series. In particular, the composition length is well defined for every group of finite length.


Proof Let the group $G$ have two Jordan–Hölder series

$$G = P_0 \supsetneq P_1 \supsetneq P_2 \supsetneq \cdots \supsetneq P_{n-1} \supsetneq P_n = \{e\}, \qquad (13.9)$$
$$G = Q_0 \supsetneq Q_1 \supsetneq Q_2 \supsetneq \cdots \supsetneq Q_{m-1} \supsetneq Q_m = \{e\}. \qquad (13.10)$$

Our plan is to insert some chains of nonstrictly decreasing subgroups between sequential elements of both series in such a way that the resulting two collections of sequential quotients will be in a natural bijection such that all the corresponding quotients are isomorphic. Since each quotient is either trivial or some Jordan–Hölder factor of $G$, this leads to a one-to-one correspondence between the composition factors coming from (13.9) and (13.10).

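It follows from Proposition 12.5 on p. 302 applied to the normal subgroup $P_{i+1} \triangleleft P_i$ and just a subgroup $Q_k \cap P_i \subset P_i$ that for each $i$, there exists a chain of subgroups

$$P_i \supseteq (Q_1 \cap P_i)P_{i+1} \supseteq (Q_2 \cap P_i)P_{i+1} \supseteq \cdots \supseteq (Q_{m-1} \cap P_i)P_{i+1} \supseteq P_{i+1}, \qquad (13.11)$$

which starts from $P_i$, ends with $P_{i+1}$, and has $(Q_{k+1} \cap P_i)P_{i+1} \triangleleft (Q_k \cap P_i)P_{i+1}$ with

$$\frac{(Q_k \cap P_i)P_{i+1}}{(Q_{k+1} \cap P_i)P_{i+1}} \simeq \frac{Q_k \cap P_i}{(Q_{k+1} \cap P_i)(Q_k \cap P_{i+1})}. \qquad (13.12)$$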


Exercise 13.15 (Zassenhaus's Butterfly Lemma) Let some group have four subgroups $A$, $B$, $C$, $D$ such that $A \triangleleft B$ and $C \triangleleft D$. Deduce from Proposition 12.5 on p. 302 that there exists an isomorphism $(B \cap D)C/(A \cap D)C \simeq (B \cap D)/(A \cap D)(B \cap C)$ and use it to prove (13.12).


The subgroup $P_{i+1}$ is normal within all subgroups of the chain (13.11). Taking quotients, we get

$$\frac{P_i}{P_{i+1}} \supseteq \frac{(Q_1 \cap P_i)P_{i+1}}{P_{i+1}} \supseteq \frac{(Q_2 \cap P_i)P_{i+1}}{P_{i+1}} \supseteq \cdots \supseteq \frac{(Q_{m-1} \cap P_i)P_{i+1}}{P_{i+1}} \supseteq \{e\}, \qquad (13.13)$$

where each subgroup is normal within the preceding one, and the sequential factors

$$\frac{(Q_k \cap P_i)P_{i+1}/P_{i+1}}{(Q_{k+1} \cap P_i)P_{i+1}/P_{i+1}} \simeq \frac{(Q_k \cap P_i)P_{i+1}}{(Q_{k+1} \cap P_i)P_{i+1}} \simeq \frac{Q_k \cap P_i}{(Q_{k+1} \cap P_i)(Q_k \cap P_{i+1})}$$

coincide with (13.12). Since the quotient group $P_i/P_{i+1}$ is simple, all inclusions in the chain (13.13) are equalities except for exactly one strict inclusion; that is, all sequential quotients in (13.12) are trivial except for exactly one, isomorphic to $P_i/P_{i+1}$.

The same argument applied to $Q$ instead of $P$ allows us to insert a nonstrictly decreasing chain of subgroups

$$Q_k \supseteq (P_1 \cap Q_k)Q_{k+1} \supseteq (P_2 \cap Q_k)Q_{k+1} \supseteq \cdots \supseteq (P_{n-1} \cap Q_k)Q_{k+1} \supseteq Q_{k+1} \qquad (13.14)$$

between any two sequential elements $Q_k \supsetneq Q_{k+1}$ in the composition series (13.10). In (13.14), each group is a normal subgroup within the preceding, and the sequential quotients

$$\frac{(P_i \cap Q_k)Q_{k+1}}{(P_{i+1} \cap Q_k)Q_{k+1}} \simeq \frac{Q_k \cap P_i}{(Q_{k+1} \cap P_i)(Q_k \cap P_{i+1})} \qquad (13.15)$$

are isomorphic to the corresponding quotients in (13.12). Thus, after insertion of chains (13.11), (13.14) between neighboring members of the series (13.9), (13.10), we get two chains of equal length equipped with a natural bijection between sequential quotients such that the corresponding factors (13.15) and (13.12) are isomorphic. Since $Q_{k+1}$ is a normal subgroup of all groups in (13.14), we can factor the chain (13.14) through it and apply the same arguments as used for $P_{i+1}$ and the chain (13.11). They show that for every fixed $k$, there is precisely one nonunit quotient among the factors (13.15), and it is isomorphic to $Q_k/Q_{k+1}$. □


Remark 13.1 A group of finite length may have many different composition series in which the Jordan–Hölder factors of the group may appear in different orders. Furthermore, two finite-length groups with equal collections of Jordan–Hölder factors are not necessarily isomorphic.


13.3.2 Finite Simple Groups 


One of the foremost mathematical achievements of the twentieth century was completing the list of all finite simple groups. It consists of several infinite series and 26 so-called sporadic simple groups lying outside the series.¹⁹ The infinite series are of three types: cyclic additive groups $\mathbb{Z}/(p)$ of prime order, even permutation groups $A_n$ for²⁰ $n \geq 5$, and the simple linear algebraic groups over finite fields.²¹ The enumeration of all finite simple groups was accomplished by collecting the results of hundreds of papers written by scores of researchers who had been working in diverse directions.²² The last gaps were filled in only in 2008. A more or less universal viewpoint that could treat the classification of finite simple groups uniformly has yet to be developed. Below, we will prove the simplicity of the even permutation groups $A_n$ for $n \geq 5$.

¹⁹ The Mathieu groups $M_{11}$, $M_{12}$, $M_{22}$, $M_{23}$, $M_{24}$, but not $M_{10}$, are among them (see Problem 12.34 on p. 306).

²⁰ However, $A_3 \simeq \mathbb{Z}/(3)$ is also simple.

²¹ Such as $\mathrm{PSL}_n(\mathbb{F}_q)$. Explicit definitions and classifying theorems for these groups can be found in textbooks on linear algebraic and/or arithmetic groups, e.g. Linear Algebraic Groups, by James E. Humphreys [Hu].


Lemma 13.1 The group $A_5$ is simple.

Proof Two permutations are conjugate in $S_5$ if and only if they have equal cyclic types. The cyclic types of even permutations are

$$(5), \quad (3, 1, 1), \quad (2, 2, 1), \quad (1, 1, 1, 1, 1), \qquad (13.16)$$

i.e., 5-cycles, 3-cycles, pairs of disjoint transpositions, and the identity. The corresponding conjugacy classes in $S_5$ have cardinalities $5!/5 = 24$, $5!/(3 \cdot 2) = 20$, $5!/(2^2 \cdot 2) = 15$, and $1$.


Exercise 13.16 Verify that every permutation of the last three types commutes with 
an odd permutation. 


Therefore, the permutations of the last three types are conjugate in $S_5$ if and only if they are conjugate in $A_5$. Hence, the last three classes are conjugacy classes within $A_5$ as well. Cycles of length 5 split into two conjugacy classes within $A_5$: 12 cycles conjugate to $|1, 2, 3, 4, 5\rangle$ and 12 cycles conjugate to $|2, 1, 3, 4, 5\rangle$.


Exercise 13.17 Check this. 


Every normal subgroup $H \triangleleft A_5$ either contains an entire nonunit conjugacy class or is disjoint from it. Therefore, $|H| = 12\varepsilon_1 + 12\varepsilon_2 + 20\varepsilon_3 + 15\varepsilon_4 + 1$, where each coefficient $\varepsilon_k$ equals either 1 or 0. At the same time, $|H|$ divides $|A_5| = 60 = 3 \cdot 4 \cdot 5$.

Exercise 13.18 Verify that this is possible only in two cases: either all $\varepsilon_k = 1$ or all $\varepsilon_k = 0$.

Thus, the normal subgroups in $A_5$ are exhausted by the trivial subgroup consisting of the identity and the whole of $A_5$. □
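The counting step of this proof is a finite search over 16 choices of coefficients, so it can be run directly. The Python sketch below enumerates them; the variable names are illustrative, and the class sizes 12, 12, 20, 15 are those obtained in the proof.

```python
from itertools import product

class_sizes = [12, 12, 20, 15]        # nonunit conjugacy classes of A_5

# Keep the coefficient choices for which 1 + (sum of chosen classes) divides 60.
candidates = []
for eps in product((0, 1), repeat=4):
    order = 1 + sum(e * s for e, s in zip(eps, class_sizes))
    if 60 % order == 0:
        candidates.append((eps, order))
```

Only `((0, 0, 0, 0), 1)` and `((1, 1, 1, 1), 60)` survive, matching the two cases of Exercise 13.18.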


Theorem 13.2 The groups $A_n$ are simple for all $n \geq 5$.

Proof By induction on $n$, with the base case $n = 5$ provided by Lemma 13.1. Let $N \triangleleft A_n$. The stabilizer $\mathrm{Stab}_{A_n}(k)$ of an element $k$ is clearly isomorphic to $A_{n-1}$. By the induction hypothesis, the subgroup

$$N \cap \mathrm{Stab}_{A_n}(k) \triangleleft \mathrm{Stab}_{A_n}(k)$$

is either $\{e\}$ or the whole of $\mathrm{Stab}_{A_n}(k)$. Since the stabilizers of all elements are conjugate, the subgroup $N$ either contains all stabilizers or intersects each stabilizer trivially. In the first case, $N$ contains all pairs of disjoint transpositions, and therefore $N = A_n$.

Exercise 13.19 Verify that $A_n$ is generated by the pairs of disjoint transpositions.

In the second case, for every $i \neq j$, there is at most one $g \in N$ such that $g(i) = j$.

Exercise 13.20 For $n \geq 6$, let $g \in A_n$ take $g(i) = j \neq i$. Show that there exists $h \neq g$ that is conjugate to $g$ in $A_n$ and takes $h(i) = j$ as well.

Since $N$ is normal, it cannot have any nonidentity permutations, i.e., $N = \{e\}$. □

²² The final part of the story is expounded in a six-volume manuscript [GLS].


13.4 Semidirect Products 


13.4.1 Semidirect Product of Subgroups 


Recall that we put $NH \stackrel{\text{def}}{=} \{xh \mid x \in N,\ h \in H\}$ for subsets $N, H \subset G$. When $N$ and $H$ are subgroups of $G$, the multiplication map $N \times H \to NH$, $(x, h) \mapsto xh$, becomes bijective if and only if $N \cap H = \{e\}$. Indeed, if $N \cap H = \{e\}$ and $x_1h_1 = x_2h_2$, then $x_2^{-1}x_1 = h_2h_1^{-1} \in N \cap H$ is forced to be $e$, and therefore $x_2 = x_1$, $h_2 = h_1$. Conversely, if $N \cap H$ contains some $z \neq e$, then the distinct pairs $(e, e)$ and $(z, z^{-1})$ are both mapped to $e$.

Two subgroups $N, H \subset G$ are called complementary if $N \cap H = \{e\}$ and $NH = G$. In this case, every element $g \in G$ can be uniquely factored as $g = xh$. If, in addition, the subgroup $N \triangleleft G$ is normal, then we say that $G$ is the semidirect product of $N$ and $H$, and we write²³ $G = N \rtimes H$. In this case,

$$(x_1h_1) \cdot (x_2h_2) = x_1(h_1x_2h_1^{-1}) \cdot h_1h_2 \quad \text{in } G,$$

where $x_1(h_1x_2h_1^{-1}) \in N$, $h_1h_2 \in H$. Thus, we can treat the composition in $G$ as a composition on the set $N \times H$, which differs from the componentwise composition within $N$ and $H$ and is defined as

$$(x_1, h_1) \cdot (x_2, h_2) \stackrel{\text{def}}{=} (x_1 \mathrm{Ad}_{h_1}(x_2),\ h_1h_2), \qquad (13.17)$$

where $\mathrm{Ad}_h : N \xrightarrow{\sim} N$, $x \mapsto hxh^{-1}$, means the adjoint action²⁴ of an element $h$ on $N$. If the adjoint action of $H$ on $N$ is trivial, i.e., $xh = hx$ for every $x \in N$, $h \in H$, then the semidirect product becomes the direct product with the usual componentwise composition $(x_1, h_1) \cdot (x_2, h_2) = (x_1x_2, h_1h_2)$.

²³ The symbol $\rtimes$ should serve as a reminder that $N \triangleleft N \rtimes H$.
²⁴ See Example 12.14 on p. 295.

Example 13.2 ($D_n \simeq \mathbb{Z}/(n) \rtimes \mathbb{Z}/(2)$) The dihedral group $D_n$ contains the normal subgroup of rotations, which is isomorphic to the additive group $\mathbb{Z}/(n)$. The cyclic group of order 2 spanned by a reflection is complementary to the subgroup of rotations and is isomorphic to the additive group $\mathbb{Z}/(2)$. The adjoint action of a reflection on a rotation changes the direction of the rotation. After identification of rotations and reflections with residue classes, this action becomes an action of $\mathbb{Z}/(2)$ on $\mathbb{Z}/(n)$ such that the classes $[0]_2, [1]_2 \in \mathbb{Z}/(2)$ act as multiplication by $+1$ and $-1$ respectively. Thus, $D_n = \mathbb{Z}/(n) \rtimes \mathbb{Z}/(2)$, and in terms of residue class pairs $(x, y) \in \mathbb{Z}/(n) \times \mathbb{Z}/(2)$, composition in $D_n$ is described by the formula $(x_1, y_1) \cdot (x_2, y_2) = (x_1 + (-1)^{y_1} x_2,\ y_1 + y_2)$, where $x_1, x_2 \in \mathbb{Z}/(n)$, $y_1, y_2 \in \mathbb{Z}/(2)$.
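The composition formula of Example 13.2 translates directly into code. In the Python sketch below, the function name `dihedral_mul` and the representation of residue classes by integers in `range(n)` and `range(2)` are assumptions made for illustration.

```python
from itertools import product

def dihedral_mul(a, b, n):
    """(x1, y1)·(x2, y2) = (x1 + (-1)^{y1} x2, y1 + y2) in Z/(n) ⋊ Z/(2)."""
    (x1, y1), (x2, y2) = a, b
    return ((x1 + (-1) ** y1 * x2) % n, (y1 + y2) % 2)

n = 5
r, s = (1, 0), (0, 1)          # a generating rotation and a reflection
elems = list(product(range(n), range(2)))

# The reflection conjugates the rotation r to its inverse (s is its own inverse).
assert dihedral_mul(dihedral_mul(s, r, n), s, n) == (n - 1, 0)

# Associativity, checked over all triples of the 2n elements.
assert all(
    dihedral_mul(dihedral_mul(a, b, n), c, n) == dihedral_mul(a, dihedral_mul(b, c, n), n)
    for a in elems for b in elems for c in elems
)
```

The first assertion is the statement that the adjoint action of a reflection reverses the direction of a rotation.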


Example 13.3 ($\mathrm{Aff}(V) \simeq V \rtimes \mathrm{GL}(V)$) We have seen in Example 12.21 on p. 301 that the group $\mathrm{Aff}(V)$ of affine automorphisms of the affinization $\mathbb{A}(V)$ of a vector space $V$ contains a normal subgroup of shifts $\tau_v : x \mapsto x + v$, which is isomorphic to the additive group of $V$. On the other hand, the stabilizer $\mathrm{Stab}_{\mathrm{Aff}(V)}(p)$ of a point $p \in \mathbb{A}(V)$ is identified with the general linear group $\mathrm{GL}(V)$ by means of the vectorization map $\mathrm{vec}_p : \mathbb{A}(V) \xrightarrow{\sim} V$, $q \mapsto \overrightarrow{pq}$, with the origin at $p$. Since the stabilizer $\mathrm{Stab}_{\mathrm{Aff}(V)}(p)$ does not contain nonzero shifts and every affine map $F : \mathbb{A}(V) \to \mathbb{A}(V)$ can be decomposed as $F = \tau_v \circ G$ for $v = \overrightarrow{pF(p)} \in V$ and $G = \tau_{-v} \circ F \in \mathrm{Stab}_{\mathrm{Aff}(V)}(p)$, we conclude that the affine group $\mathrm{Aff}(V)$ is the semidirect product of the subgroup of shifts $V \triangleleft \mathrm{Aff}(V)$ and the general linear group $\mathrm{GL}(V)$ embedded in $\mathrm{Aff}(V)$ as the stabilizer of some point $p \in \mathbb{A}(V)$. By Exercise 12.27 on p. 301, the adjoint action of $\mathrm{GL}(V)$ on shifts coincides with the tautological action of linear maps on vectors. Therefore, in terms of the pairs $(v, F) \in V \times \mathrm{GL}(V)$, the composition in $\mathrm{Aff}(V) \simeq V \rtimes \mathrm{GL}(V)$ is given by the formula $(u, F) \cdot (w, G) = (u + F(w), FG)$.


13.4.2 Semidirect Product of Groups 


A composition of type (13.17) can be defined for any two abstract groups $N$, $H$, not necessarily given as complementary subgroups of some ambient group. In this general setup, instead of the adjoint action, we consider an arbitrary action of the group $H$ on the group $N$ by group automorphisms, i.e., any group homomorphism

$$\psi : H \to \mathrm{Aut}\, N, \quad h \mapsto \psi_h : N \xrightarrow{\sim} N. \qquad (13.18)$$

Given such an action, we equip the direct product of sets $N \times H$ with the binary operation

$$(x_1, h_1) \cdot (x_2, h_2) \stackrel{\text{def}}{=} (x_1 \psi_{h_1}(x_2),\ h_1h_2). \qquad (13.19)$$

Exercise 13.21 Check that the composition (13.19) provides $N \times H$ with a group structure with unit $(e, e)$ and inverses given by $(x, h)^{-1} = (\psi_h^{-1}(x^{-1}), h^{-1})$, where $\psi_h^{-1} = \psi_{h^{-1}} : N \xrightarrow{\sim} N$ is the group automorphism inverse to $\psi_h : N \xrightarrow{\sim} N$.


The resulting group is called the semidirect product of the groups $N$, $H$ with respect to the action $\psi : H \to \mathrm{Aut}\, N$ and is denoted by $N \rtimes_\psi H$. Let me stress that the semidirect product actually depends on the choice of $\psi$. For example, if $\psi$ is trivial, that is, $\psi_h = \mathrm{Id}_N$ for all $h$, then $N \rtimes_\psi H = N \times H$ is the usual direct product with componentwise composition.
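To see the general construction at work on a small nonabelian case, the Python sketch below builds $\mathbb{Z}/(p) \rtimes_\psi \mathbb{Z}/(q)$ for an action of the form $\psi_h(x) = k^h x$ and checks the inverse formula $(x, h)^{-1} = (\psi_{h^{-1}}(x^{-1}), h^{-1})$. The function `make_semidirect` and the sample values $p = 7$, $q = 3$, $k = 2$ are illustrative assumptions; both groups are written additively.

```python
from itertools import product

def make_semidirect(p, q, k):
    """Z/(p) ⋊_ψ Z/(q) with ψ_h(x) = k^h · x mod p; needs k^q ≡ 1 (mod p)."""
    assert pow(k, q, p) == 1           # makes h -> ψ_h a homomorphism from Z/(q)

    def psi(h, x):
        return pow(k, h, p) * x % p

    def mul(a, b):
        (x1, h1), (x2, h2) = a, b
        return ((x1 + psi(h1, x2)) % p, (h1 + h2) % q)

    def inv(a):
        x, h = a
        return (psi(-h % q, -x % p), -h % q)

    return mul, inv

mul, inv = make_semidirect(7, 3, 2)    # 2 has multiplicative order 3 modulo 7
elems = list(product(range(7), range(3)))

# (0, 0) is the unit, and inv really is a two-sided inverse.
assert all(mul(a, inv(a)) == (0, 0) == mul(inv(a), a) for a in elems)
```

With $k = 1$ the action is trivial and `mul` reduces to componentwise addition, i.e., to the direct product; with $k = 2$ the resulting group of order 21 is not abelian.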


Exercise 13.22 Check that for every semidirect product $G = N \rtimes_\psi H$:
(a) $N' = \{(x, e) \mid x \in N\}$ is a normal subgroup of $G$ isomorphic to $N$, and $G/N' \simeq H$,
(b) $H' = \{(e, h) \mid h \in H\}$ is the subgroup of $G$ complementary to $N'$, and $G = N' \rtimes H'$.


13.5 p-Groups and Sylow’s Theorems 
13.5.1 p-Groups in Action 


Every finite group of order $p^n$, where $p \in \mathbb{N}$ is prime, is called a $p$-group. Every subgroup of a $p$-group is itself a $p$-group by Lagrange's theorem.²⁵ In particular, the stabilizer of a point in an arbitrary action of a $p$-group is a $p$-group, and therefore, the length of every orbit is either divisible by $p$ or equal to 1. This leads to a simple but very useful claim:

Proposition 13.5 Every action of a $p$-group $G$ on a finite set $X$ such that $p \nmid |X|$ has a fixed point.

Proof The lengths of the orbits sum to $|X|$, which is not divisible by $p$. Hence not every orbit has length divisible by $p$, and there is some orbit of length 1. □


Proposition 13.6 Every $p$-group $G$ has a nontrivial center

$$Z(G) = \{c \in G \mid \forall\, g \in G \ \ cg = gc\}.$$

Proof The center $Z(G)$ is the fixed-point set of the adjoint action²⁶ of $G$ on itself. Since the lengths of all orbits in $G \smallsetminus Z(G)$ are divisible by $p$, it follows that $|Z(G)| = |G| - |G \smallsetminus Z(G)|$ is divisible by $p$ and positive, because $e \in Z(G)$. □
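As a concrete instance of Proposition 13.6, the dihedral group $D_4$ has order $8 = 2^3$, so its center must have even order. The Python sketch below computes it by brute force, reusing the pair description of $D_n$ from Example 13.2; the function names are illustrative.

```python
from itertools import product

def dihedral_mul(a, b, n):
    """Composition in D_n written as Z/(n) ⋊ Z/(2), as in Example 13.2."""
    (x1, y1), (x2, y2) = a, b
    return ((x1 + (-1) ** y1 * x2) % n, (y1 + y2) % 2)

def center(n):
    """All elements of D_n that commute with every element of D_n."""
    elems = list(product(range(n), range(2)))
    return [c for c in elems
            if all(dihedral_mul(c, g, n) == dihedral_mul(g, c, n) for g in elems)]

center_d4 = center(4)      # the identity and the rotation by 180 degrees
```

For comparison, `center(3)` contains only the identity: $D_3$ has order 6, which is not a prime power, so the proposition does not apply to it.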


Exercise 13.23 Show that every group of order $p^2$ is abelian.

²⁵ See Theorem 12.3 on p. 300.
²⁶ See Example 12.14 on p. 295.

Let $G$ be an arbitrary finite group. Write its order as $|G| = p^n m$, where $p, n, m \in \mathbb{N}$, $p$ is prime, and $\mathrm{GCD}(m, p) = 1$. A subgroup $S \subset G$ of order $|S| = p^n$ is called a Sylow $p$-subgroup. The total number of the Sylow $p$-subgroups in $G$ is denoted by $N_p(G)$.


Theorem 13.3 (Sylow’s Theorem) For every finite group G and prime divisor p 
of |G| there exists a Sylow p-subgroup in G. Every p-subgroup of G is contained in 
some Sylow p-subgroup. All Sylow p-subgroups are conjugate. 


Proof Let $|G| = p^n m$, where $\mathrm{GCD}(m, p) = 1$ as above. Write $\mathcal{E}$ for the set of all subsets of cardinality $p^n$ in $G$. The group $G$ acts on $\mathcal{E}$ by left multiplication: an element $g \in G$ maps $X \mapsto gX = \{gx \mid x \in X\}$. For every $F \in \mathcal{E}$, $F \subset G$, the stabilizer $\mathrm{Stab}(F) = \{g \in G \mid gF = F\}$ acts on the set $F$ by left multiplication. This action is free, because $g_1x \neq g_2x$ for $g_1 \neq g_2$ in the group $G$. Since each orbit consists of $|\mathrm{Stab}(F)|$ points, $p^n = |F|$ is divisible by $|\mathrm{Stab}(F)|$. Thus $|\mathrm{Stab}(F)| = p^k$ with $k \leq n$, and the $G$-orbit of $F$ has length $|G|/|\mathrm{Stab}(F)| = p^{n-k}m$. We come to the following two alternatives for the length of the $G$-orbit of a point $F \in \mathcal{E}$ under the action of $G$ on $\mathcal{E}$: it is either divisible by $p$ or equal to $m$.

If the second case occurs for some $F$, then $|\mathrm{Stab}(F)| = p^n$, i.e., $\mathrm{Stab}(F) \subset G$ is a Sylow $p$-subgroup. By Proposition 13.5, the action of any $p$-subgroup $H \subset G$ on the length-$m$ orbit of $F$ has some fixed point $gF$. This means that $H \subset \mathrm{Stab}(gF) = g\, \mathrm{Stab}(F)\, g^{-1}$, i.e., $H$ is contained in some Sylow $p$-subgroup conjugate to $\mathrm{Stab}(F)$. If $H$ itself is Sylow, i.e., $|H| = p^n$, then this inclusion is an equality.

If the first case occurs for all $F \in \mathcal{E}$, then the lengths of all orbits in the action of $G$ on $\mathcal{E}$ are divisible by $p$. This is impossible, because $|\mathcal{E}| = \binom{p^n m}{p^n} \equiv m \pmod{p}$ is coprime to $p$ by Exercise 2.10 on p. 29. □
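The congruence $\binom{p^n m}{p^n} \equiv m \pmod p$ used in the last step is easy to test numerically. The Python sketch below checks it for a few small primes; the helper name `subsets_mod_p` and the ranges tested are arbitrary choices, and the check illustrates the congruence rather than proving it.

```python
from math import comb

def subsets_mod_p(p, n, m):
    """Number of p^n-element subsets of a set of size p^n * m, reduced mod p."""
    return comb(p ** n * m, p ** n) % p

for p in (2, 3, 5, 7):
    for n in (1, 2):
        for m in range(1, 8):
            assert subsets_mod_p(p, n, m) == m % p
```

For example, a group of order $12 = 2^2 \cdot 3$ has $\binom{12}{4} = 495$ four-element subsets, and $495 \equiv 3 \equiv 1 \pmod 2$.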


Corollary 13.1 (Addendum to Sylow's Theorem) $N_p \mid m$ and $N_p \equiv 1 \pmod{p}$ for every prime divisor $p$ of $|G|$.
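The two conditions of the corollary often leave very few possibilities for $N_p$, and listing them is a one-line computation. In the Python sketch below, the function name `sylow_count_candidates` is hypothetical; it returns the values permitted by the corollary, not the actual number of Sylow subgroups of any particular group.

```python
def sylow_count_candidates(order, p):
    """Values d with d | m and d ≡ 1 (mod p), where order = p^n · m and p ∤ m."""
    m = order
    while m % p == 0:          # strip the full power of p from the order
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]
```

For instance, for a group of order 15 both `sylow_count_candidates(15, 3)` and `sylow_count_candidates(15, 5)` equal `[1]`, so both Sylow subgroups are normal, whereas order 12 allows `[1, 3]` for $p = 2$ and `[1, 4]` for $p = 3$.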


Proof Write $\mathcal{S}$ for the set of all Sylow $p$-subgroups in $G$ and consider the adjoint action of $G$ on $\mathcal{S}$, in which $g \in G$ maps $H \mapsto gHg^{-1}$. This action is transitive by Sylow's theorem. Therefore, $|\mathcal{S}| = |G|/|\mathrm{Stab}(P)|$ for every $P \in \mathcal{S}$. Since $P \subset \mathrm{Stab}(P)$, we conclude that $|P| = p^n$ divides $|\mathrm{Stab}(P)|$. This forces $|\mathcal{S}|$ to divide $|G|/p^n = m$ and proves the first statement. To prove the second it is enough to check that the adjoint action of a $p$-group $P$ on $\mathcal{S}$ has exactly one fixed point, namely $P$ itself. In this case, the lengths of all the other orbits are multiples of $p$, and we obtain the required congruence $|\mathcal{S}| \equiv 1 \pmod{p}$. Let $H \in \mathcal{S}$ be fixed by $P$. Then $P \subset \mathrm{Stab}(H) = \{g \in G \mid gHg^{-1} \subset H\}$. By Lagrange's theorem, the inclusions $H \subset \mathrm{Stab}(H) \subset G$ force $|\mathrm{Stab}(H)| = p^n m'$, where $m' \mid m$ and $\mathrm{GCD}(m', p) = 1$. Therefore, both $P$ and $H$ are Sylow $p$-subgroups of $\mathrm{Stab}(H)$. Since all Sylow $p$-subgroups are conjugate and $H$ is normal in $\mathrm{Stab}(H)$, we conclude that $H = P$. □

Example 13.4 (Groups of Order $pq$ for $\mathrm{GCD}(p - 1, q) = 1$) Let $|G| = pq$, where $p > q$ and both $p$, $q$ are prime. Then $G$ has exactly one Sylow $p$-subgroup $H_p \simeq \mathbb{Z}/(p)$, and it is normal. Every Sylow $q$-subgroup $H_q \simeq \mathbb{Z}/(q)$ has $H_p \cap H_q = e$, because both groups $H_p$ and $H_q$ are simple. Hence, the multiplication map

$$H_p \times H_q \to H_pH_q \subset G$$

is injective, and therefore $H_pH_q = G$ by a cardinality argument. We conclude that $H_q$ is complementary to $H_p$. By Sect. 13.4, this forces

$$G = H_p \rtimes H_q \simeq \mathbb{Z}/(p) \rtimes_\psi \mathbb{Z}/(q)$$

for some action $\psi : \mathbb{Z}/(q) \to \mathrm{Aut}(\mathbb{Z}/(p))$.


Exercise 13.24 Verify that for every coprime $m, n \in \mathbb{Z}$, there are no nontrivial homomorphisms of additive groups $\mathbb{Z}/(m) \to \mathbb{Z}/(n)$.

By Exercise 12.4 on p. 280, $\mathrm{Aut}(\mathbb{Z}/(p)) \simeq \mathbb{F}_p^*$. For $\mathrm{GCD}(p - 1, q) = 1$, the above exercise implies that every action of $\mathbb{Z}/(q)$ on $\mathbb{Z}/(p)$ is trivial. Hence,

$$G \simeq \mathbb{Z}/(p) \oplus \mathbb{Z}/(q)$$

if $q$ is coprime to $p - 1$.
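An action $\psi : \mathbb{Z}/(q) \to \mathbb{F}_p^*$ is determined by the image $k = \psi([1])$, which must satisfy $k^q = 1$. The Python sketch below lists all such $k$ for given primes; the function name `multipliers` is an illustrative choice, and the comparison with $\mathrm{GCD}(q, p - 1)$ relies on the standard fact that $\mathbb{F}_p^*$ is cyclic of order $p - 1$.

```python
from math import gcd

def multipliers(p, q):
    """All k in F_p^* with k^q = 1; each defines the action h -> (x -> k^h x)."""
    return [k for k in range(1, p) if pow(k, q, p) == 1]

# The number of admissible k equals GCD(q, p - 1); it is 1 exactly when
# q is coprime to p - 1, in which case only the trivial action remains.
for p, q in [(5, 3), (7, 3), (11, 5), (13, 5)]:
    assert len(multipliers(p, q)) == gcd(q, p - 1)
```

Thus every group of order 15 is $\mathbb{Z}/(5) \oplus \mathbb{Z}/(3)$, while order 21 admits the nontrivial choices $k = 2$ and $k = 4$.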


Example 13.5 (Groups of Order $2p$) Let $|G| = 2p$ for prime $p > 2$. The same arguments as in the previous example show that $G = \mathbb{Z}/(p) \rtimes_\psi \mathbb{Z}/(2)$ for some action $\psi : \mathbb{Z}/(2) \to \mathrm{Aut}(\mathbb{Z}/(p)) \simeq \mathbb{F}_p^*$. The latter is completely determined by an element $\psi([1]) \in \mathbb{F}_p^*$ such that $\psi([1])^2 = 1$. There are exactly two such elements: $\psi([1]) = 1$ and $\psi([1]) = -1$. For the first choice, the action $\psi$ is trivial, and $G \simeq \mathbb{Z}/(p) \oplus \mathbb{Z}/(2)$. In the second case, the classes $[0]_2, [1]_2 \in \mathbb{Z}/(2)$ act on $\mathbb{Z}/(p)$ as multiplication by $+1$ and $-1$ respectively. Thus, $G \simeq D_p$ by Example 13.2 on p. 329.


Problems for Independent Solution to Chap. 13 


Problem 13.1 Show that for $n \geq 3$, the group $A_n$ is generated by the 3-cycles

$$|k - 2, k - 1, k\rangle, \quad 3 \leq k \leq n.$$

Problem 13.2 Enumerate all conjugacy classes in $A_6$ and indicate their cardinalities.

Problem 13.3 Which conjugacy classes of the even permutations in $S_n$ split into several distinct conjugacy classes within $A_n$?

Problem 13.4 Show that the center of $S_n$ is trivial for $n \geq 3$.

Problem 13.5 (Simplicity of $\mathrm{SO}_3$) Let $G = \mathrm{SO}_3(\mathbb{R})$ be the rotation group²⁷ of the Euclidean space $\mathbb{R}^3$. For every $v \in \mathbb{R}^3$ and $\varphi \in \mathbb{R}$, write $R_{v,\varphi} \in G$ for rotation about the axis spanned by $v$ by the angle $\varphi$ in the clockwise direction as it looks in the direction of $v$. Check that $F R_{v,\varphi} F^{-1} = R_{Fv,\varphi}$ for all $F \in G$ and use this to prove that the group $G$ is simple.


Problem 13.6 Show that the complete group $O_{C^n}$ of the standard $n$-cocube²⁸ $C^n \subset \mathbb{R}^n$:
(a) is presented by generators $x_1, x_2, \ldots, x_n$ and relators $x_i^2$, $(x_ix_j)^2$, $(x_{k-1}x_k)^3$, $(x_{n-1}x_n)^4$ for all $1 \leq i, j \leq n$ such that $|i - j| \geq 2$ and all $2 \leq k \leq n - 1$, where the generators act on $C^n$ by reflections $\sigma_1, \sigma_2, \ldots, \sigma_n$ in the hyperplanes $\pi_n = e_n^\perp$ and $\pi_i = (e_i - e_{i+1})^\perp$, $1 \leq i \leq (n - 1)$.
(b) is isomorphic to the semidirect product $(\mathbb{Z}/2)^n \rtimes S_n$, where $S_n$ acts on $(\mathbb{Z}/2)^n$ by permutations of coordinates.


Problem 13.7 Describe the automorphism groups of the following groups:
(a) $\mathbb{Z}/(n)$, (b) $\mathbb{Z}/(2) \times \mathbb{Z}/(2)$, (c) $D_3$, (d) $D_4$, (e) $Q_8$.


Problem 13.8 Which groups in the previous problem have no outer automorphisms? 


Problem 13.9 Let two actions g, y : H — AutN be related as y = Ad, oy for 
some g € N. Show that N x, H ~ Nx, H. 


Problem 13.10 Let $p$ be the smallest prime divisor of $|G|$. Show that every subgroup of index $p$ in $G$ is normal.²⁹

Problem 13.11 Show that every group of even order contains an element of order 2.

Problem 13.12 For a group of order $p^n$, show that every subgroup of order $p^k$, $k < n$, is contained in some subgroup of order $p^{k+1}$.

Problem 13.13 Is there a simple group of order 12?

Problem 13.14 Describe all groups of order $\leq 15$ up to isomorphism.


²⁷ See Example 10.13 on p. 247.
²⁸ See Problem 10.13 on p. 250.
²⁹ For example, every subgroup of index 2 is normal, every subgroup of index 3 in a group of odd order is normal, etc.

Problem 13.15 Show that there are exactly three mutually nonisomorphic nondirect semidirect products $\mathbb{Z}/(8) \rtimes \mathbb{Z}/(2)$.


Problem 13.16 Describe all groups of order 55. 

Problem 13.17 Use the action of $G$ on its Sylow 5-subgroups to show that $A_5$ is the only simple group of order 60.

Problem 13.18 Give an example of two nonisomorphic groups $G_1 \not\simeq G_2$ and normal subgroups $H_1 \triangleleft G_1$, $H_2 \triangleleft G_2$ such that $G_1/H_1 \simeq G_2/H_2$.


Chapter 14 
Modules over a Principal Ideal Domain 


In this chapter, K by default means an arbitrary commutative ring with unit and k 
means an arbitrary field. A K-module always means a unital module over K. 


14.1 Modules over Commutative Rings Revisited 


Recall¹ that a unital module over a commutative ring $K$ with unit is an additive abelian group $M$ equipped with multiplication by constants

$$K \times M \to M$$

possessing all the properties of multiplication of vectors by scalars in a vector space that were listed in Definition 6.1 on p. 123. An abelian subgroup $N \subset M$ is called a $K$-submodule if $\lambda a \in N$ for all $\lambda \in K$ and all $a \in N$. Submodules $N \subsetneq M$ are called proper. If $N \subset M$ is a submodule, then the quotient module $M/N$ consists of congruence classes

$$[m]_N = m \ (\mathrm{mod}\ N) = m + N = \{m' \in M \mid m' - m \in N\},$$

which can be viewed as equivalence classes modulo the relation $m \sim m'$, meaning that $m' - m \in N$. These classes are added and multiplied by constants by the standard rules

$$[m_1] + [m_2] \stackrel{\text{def}}{=} [m_1 + m_2] \quad \text{and} \quad \lambda[m] \stackrel{\text{def}}{=} [\lambda m].$$

¹ See Definition 6.2 on p. 124.

Exercise 14.1 Check that both operations are well defined and satisfy all axioms from Definition 6.1 on p. 123.


A homomorphism of K-modules² is a homomorphism of abelian groups $\varphi : M \to N$
that respects multiplication by scalars, i.e., such that $\varphi(\lambda v) = \lambda \varphi(v)$ for all $\lambda \in K$
and all $v \in M$. Homomorphisms of modules possess all the properties valid for
homomorphisms of abelian groups. In particular,³ $\varphi(0) = 0$ and $\varphi(v - w) = \varphi(v) - \varphi(w)$
for all $v, w \in M$; all nonempty fibers $\varphi^{-1}(\varphi(v)) = v + \ker\varphi$ are congruence
classes modulo the kernel $\ker\varphi = \{a \in M \mid \varphi(a) = 0\}$; the image $\operatorname{im}\varphi = \varphi(M)$ is
isomorphic to $M/\ker\varphi$; etc. In particular, $\varphi$ is injective if and only if $\ker\varphi = 0$.
All K-linear maps $M \to N$ form a K-module, denoted by Hom(M, N). If a precise
reference to the ground ring K is needed, we write $\operatorname{Hom}_K(M, N)$. The addition of
homomorphisms and their multiplication by constants are defined by the same rules
as for linear maps of vector spaces: given $f, g : M \to N$ and $\lambda, \mu \in K$, then

\[ \lambda f + \mu g : m \mapsto \lambda f(m) + \mu g(m). \]


14.1.1 Free Modules 


Recall that a subset $E \subset M$ is called a basis of M if each vector $w \in M$ has a unique
linear expression $w = \sum_{e \in E} x_e e$, where $x_e \in K$ and all but a finite number of the
$x_e$ are equal to zero.


Exercise 14.2 Check that a set of vectors is a basis if and only if it is linearly 
independent and spans the module. 


If a module has a basis, it is called a free module. For example, all coordinate
modules $K^n$ are free for every ring K. All vector spaces over a field k are free
k-modules by Theorem 6.1 on p. 132. In Sect. 6.2 on p. 127, we saw some other
examples of free modules. Elements of a free module with basis E can be thought
of as functions $x : E \to K$ with finite support.


Lemma 14.1 A subset $E \subset M$ is a basis of a K-module M if and only if for every
K-module N and every map of sets $\varphi : E \to N$, there exists a unique homomorphism
of K-modules $f_\varphi : M \to N$ that extends $\varphi$.


Proof If E is a basis, then every K-linear map $f_\varphi : M \to N$ that maps $e \mapsto \varphi(e)$
should take $f_\varphi\bigl(\sum_{e \in E} x_e e\bigr) = \sum_{e \in E} x_e \varphi(e)$. On the other hand, the prescription
$(x_e)_{e \in E} \mapsto \sum_{e \in E} x_e \varphi(e)$ actually defines a K-linear map from a free module with
basis E to N.


Exercise 14.3. Check this. 


² Also called a K-linear map.
³ See Sect. 2.6 on p. 31.

Now assume that $E \subset M$ satisfies the second condition and write N for the free
K-module with basis E. Then the identity map $\mathrm{Id}_E : E \to E$ is uniquely extended
to K-linear maps $f : M \to N$ and $g : N \to M$. Since the composition $gf : M \to M$
acts identically on $E \subset M$, it coincides with the identity map $\mathrm{Id}_M$ by our assumption.
Since N is free, the composition $fg : N \to N$ coincides with $\mathrm{Id}_N$ for the same reason.
Thus, f and g are mutually inverse isomorphisms, i.e., $M \simeq N$ is free. □


Exercise 14.4 Show that $\operatorname{Hom}(K^m, N) \simeq N^{\oplus m}$ (direct sum of m copies of N).


14.1.2. Generators and Relations 


If the ground ring K is not a field, then not all K-modules are free. A common way 
to represent a nonfree K-module M is to choose linear generators for M and list all 
the linear relations between them. Then the vectors of M can be treated as linear 
combinations of the chosen generators considered up to the listed linear relations. 
This is formalized as follows. 


Associated with a collection of vectors $w = (w_1, w_2, \ldots, w_m)$ in M is a linear
map

\[ \pi_w : K^m \to M, \quad e_i \mapsto w_i, \tag{14.1} \]

which sends $(x_1, x_2, \ldots, x_m) \in K^m$ to the linear combination

\[ x_1 w_1 + x_2 w_2 + \cdots + x_m w_m \in M. \]

The vectors $w_1, w_2, \ldots, w_m$ are linearly independent if and only if $\pi_w$ is injective.
They span M if and only if $\pi_w$ is surjective. In the latter case, $M \simeq K^m / R_w$, where
$R_w \stackrel{\text{def}}{=} \ker \pi_w$ is a submodule formed by the coefficients $(x_1, x_2, \ldots, x_m)$ of all linear
relations $x_1 w_1 + x_2 w_2 + \cdots + x_m w_m = 0$ in M. For this reason, $R_w$ is called a
module of relations between generators $w_1, w_2, \ldots, w_m$. Note that by construction,
$R_w$ is a submodule of a free module. Thus, an arbitrary finitely generated module
over a ring K can be written as $K^m / R$ for appropriate $m \in \mathbb{N}$ and $R \subset K^m$.


Example 14.1 (Ideals) Every commutative ring K is clearly a K-module. A subset
$I \subset K$ is a submodule if and only if I is an ideal of K. A principal ideal $(a) \subset K$ is a
free K-module if and only if the generator a is not a zero divisor. Every nonprincipal
ideal is generated by at least two elements, which are linearly related, because any
two elements $a, b \in K$ are linearly related, e.g., as $a \cdot b - b \cdot a = 0$. For example,
the ideal $I = (x, y) \subset \mathbb{Q}[x, y]$ considered as a module over the polynomial ring
$K = \mathbb{Q}[x, y]$ cannot be generated by one element, because x, y have no nonconstant
common divisors. The epimorphism $\pi_{(x,y)} : K^2 \to I$, $(f, g) \mapsto xf + yg$, associated
with the generators x, y has free kernel $R_{(x,y)} = \ker \pi_{(x,y)}$ with basis $(y, -x)$, because
the equality $xf = -yg$ forces $f = yh$, $g = -xh$ for some $h \in \mathbb{Q}[x, y]$, since $\mathbb{Q}[x, y]$
is factorial, and therefore every K-linear relation between x and y is proportional to
$(y, -x)$. We conclude that $I \simeq K^2 / K \cdot (y, -x)$ as a K-module.

Example 14.2 (Abelian Groups) Every abelian group A has a canonical $\mathbb{Z}$-module
structure defined by

\[ (\pm n) \cdot a \stackrel{\text{def}}{=} \pm (\underbrace{a + a + \cdots + a}_{n \text{ summands}}) \]

for $n \in \mathbb{N}$, $a \in A$, and $0 \cdot a = 0$ for all $a \in A$.
Exercise 14.5 Verify the axioms from Definition 6.1 on p. 123. 


For the additive group of residue classes $M = \mathbb{Z}/(k)$, such multiplication by integers
yields $n \cdot [m]_k = [nm]_k$, where $[x]_k = x \pmod{k}$ denotes the residue class of $x \in \mathbb{Z}$
modulo k. Therefore, M is generated over $\mathbb{Z}$ by one element $[1]_k$ satisfying the
nontrivial linear relation $k \cdot [1]_k = 0$. Thus, $[1]_k$ is not a basis for M, just a linear
generator. The epimorphism (14.1) provided by this generator is nothing but the
quotient map $\mathbb{Z} \to \mathbb{Z}/(k)$, $n \mapsto [n]_k$, with kernel $R = (k)$, and the representation
$M \simeq K^m / R = \mathbb{Z}/(k)$ is a tautology. The linear relation $k \cdot [1]_k = 0$ prohibits nonzero
homomorphisms $\varphi : \mathbb{Z}/(k) \to \mathbb{Z}$. Indeed, $k \cdot \varphi([1]_k) = \varphi(k \cdot [1]_k) = \varphi(0) = 0$
forces $\varphi([1]_k) = 0$, because $\mathbb{Z}$ has no zero divisors. Then $\varphi([m]_k) = \varphi(m \cdot [1]_k) =
m \cdot \varphi([1]_k) = 0$ for every residue class $[m]_k$.


Exercise 14.6 Show that $[n]_k$ linearly generates $\mathbb{Z}/(k)$ over $\mathbb{Z}$ if and only if
$\operatorname{GCD}(n, k) = 1$.
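The claim of Exercise 14.6 is easy to test numerically before proving it. The following is only a sanity check for small moduli, not a proof; the function name `generates` is a hypothetical helper introduced for this illustration.

```python
from math import gcd

def generates(n, k):
    """True if the integer multiples of [n]_k exhaust Z/(k)."""
    return {(t * n) % k for t in range(k)} == set(range(k))

# Compare with the gcd criterion for all small moduli.
for k in range(1, 40):
    for n in range(k):
        assert generates(n, k) == (gcd(n, k) == 1)
```

For instance, `generates(3, 8)` holds while `generates(2, 8)` fails, since the multiples of 2 only reach the even residues.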


14.1.3 Linear Maps 


Let a K-module M be spanned by vectors $w_1, w_2, \ldots, w_m$. Then every K-linear map
$F : M \to N$ is uniquely determined by its values $u_i = F(w_i)$ on these generators,
because every vector $v = x_1 w_1 + x_2 w_2 + \cdots + x_m w_m \in M$ should be sent to

\[ F(v) = F(x_1 w_1 + x_2 w_2 + \cdots + x_m w_m) = x_1 u_1 + x_2 u_2 + \cdots + x_m u_m. \tag{14.2} \]

However, not every set of vectors $u_1, u_2, \ldots, u_m \in N$ allows us to construct a
K-linear homomorphism $F : M \to N$ such that $F(w_i) = u_i$. For example, we have
seen in Example 14.2 that there are no nonzero $\mathbb{Z}$-linear maps $\mathbb{Z}/(k) \to \mathbb{Z}$ at all.


Proposition 14.1 If the vectors $w_1, w_2, \ldots, w_m$ span the module M, then for every
collection of vectors $u_1, u_2, \ldots, u_m$ in a K-module N, there exists at most one
homomorphism $F : M \to N$ such that $F(w_i) = u_i$. It exists if and only if for
every linear relation $\lambda_1 w_1 + \lambda_2 w_2 + \cdots + \lambda_m w_m = 0$ in M, the same relation
$\lambda_1 u_1 + \lambda_2 u_2 + \cdots + \lambda_m u_m = 0$ holds in N. In this case, F is well defined by the
formula (14.2).

Proof The collection of vectors $u_1, u_2, \ldots, u_m \in N$ is the same as the map

\[ \{e_1, e_2, \ldots, e_m\} \to N \]

from the standard basis of $K^m$ to N. By Lemma 14.1, such maps are in bijection
with the K-linear maps $K^m \to N$. Under this bijection, the collection $u =
(u_1, u_2, \ldots, u_m)$ corresponds to the map

\[ F_u : (x_1, x_2, \ldots, x_m) \mapsto x_1 u_1 + x_2 u_2 + \cdots + x_m u_m. \]

It is factored through the quotient module $M = K^m / R_w$ as

\[ F_u = F \circ \pi_w : \; K^m \xrightarrow{\ \pi_w\ } M \xrightarrow{\ F\ } N \]

if and only if $R_w \subset \ker F_u$, which means precisely that for all $(x_1, x_2, \ldots, x_m) \in K^m$,

\[ x_1 w_1 + x_2 w_2 + \cdots + x_m w_m = 0 \;\Longrightarrow\; x_1 u_1 + x_2 u_2 + \cdots + x_m u_m = 0. \]

If this holds, then F has to be defined by the formula (14.2). □
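As an illustration of Proposition 14.1, take $M = \mathbb{Z}/(k)$ with the single generator $[1]_k$, whose module of relations is $(k)$, and $N = \mathbb{Z}/(n)$. A value $u = F([1]_k)$ then defines a homomorphism exactly when $k \cdot u = 0$ in $\mathbb{Z}/(n)$. The sketch below lists these values; the name `homs` is a hypothetical helper chosen for this example.

```python
from math import gcd

def homs(k, n):
    """Residues u in Z/(n) for which [1]_k -> u extends to a homomorphism Z/(k) -> Z/(n)."""
    # The only relation on the generator is k*[1]_k = 0, so we need k*u = 0 in Z/(n).
    return [u for u in range(n) if (k * u) % n == 0]

# The number of admissible values equals gcd(k, n) in every case checked here.
for k in range(1, 25):
    for n in range(1, 25):
        assert len(homs(k, n)) == gcd(k, n)
```

For example, `homs(4, 6)` returns `[0, 3]`, while `homs(5, 6)` returns `[0]`: every homomorphism $\mathbb{Z}/(5) \to \mathbb{Z}/(6)$ is zero.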


14.1.4 Matrices of Linear Maps 


Let K-modules M and N be spanned by the vectors

\[ w = (w_1, w_2, \ldots, w_m) \quad\text{and}\quad u = (u_1, u_2, \ldots, u_n) \]

respectively. For every K-linear map $F : M \to N$, write $F_{uw} \in \operatorname{Mat}_{n \times m}(K)$ for the
matrix whose jth column is formed by the coefficients of some linear expansion
of the vector $F w_j$ through the generators $u_i$. Therefore, $(F w_1, F w_2, \ldots, F w_m) =
(u_1, u_2, \ldots, u_n) \cdot F_{uw}$. If the generators $u_i$ are linearly related, then such a matrix
is not unique. Thus, the notation $F_{uw}$ is not quite correct. If the generators $w_j$ are
linearly related, then not every matrix in $\operatorname{Mat}_{n \times m}(K)$ is the matrix of a linear map
$F : M \to N$. Nevertheless, if we are given a well-defined homomorphism of
K-modules $F : M \to N$, then for every choice of the matrix $F_{uw}$ and linear expansion
$v = \sum_j w_j x_j = w \cdot x$, where $x = (x_1, x_2, \ldots, x_m)^t$, the image F(v) certainly can be
linearly expanded through generators u as $F(v) = u \cdot F_{uw} x$.

Proposition 14.2 Let a K-module M be generated by the vectors $w_1, w_2, \ldots, w_m$
and suppose the linear endomorphism $F : M \to M$ sends them to

\[ (F w_1, F w_2, \ldots, F w_m) = (w_1, w_2, \ldots, w_m) \cdot F_w, \]

where $F_w \in \operatorname{Mat}_m(K)$. Then the image of the homothety

\[ \det F_w : M \to M, \quad v \mapsto v \cdot \det F_w, \]

is contained in im F.


Proof Multiplication by $\det F_w$ acts on the generating vectors by the rule

\[ (w_1, w_2, \ldots, w_m) \mapsto (w_1, w_2, \ldots, w_m) \cdot \det F_w \cdot E = (w_1, w_2, \ldots, w_m) \cdot F_w \cdot F_w^{\vee}, \]

where E is the identity matrix and $F_w^{\vee}$ is the adjunct matrix⁴ of $F_w$. Since the columns
of $F_w \cdot F_w^{\vee}$ are in the linear span of the columns of $F_w$, multiplication by $\det F_w$ sends
each $w_i$ into im F. □


Example 14.3 (Another Proof of the Cayley–Hamilton Identity) Recall that for $A \in
\operatorname{Mat}_n(K)$ and $f(t) = f_0 + f_1 t + \cdots + f_m t^m \in K[t]$, we write

\[ f(A) \stackrel{\text{def}}{=} f_0 E + f_1 A + f_2 A^2 + \cdots + f_m A^m \in \operatorname{Mat}_n(K) \]

for the result of the evaluation⁵ of f at A in $\operatorname{Mat}_n(K)$. For an arbitrary matrix $A \in
\operatorname{Mat}_n(K)$, consider the coordinate K-module $K^n$, whose elements will be written in
columns, and equip it with a K[t]-module structure by the rule

\[ f(t) \cdot v \stackrel{\text{def}}{=} f(A)\, v = f_0 v + f_1 A v + f_2 A^2 v + \cdots + f_m A^m v. \tag{14.3} \]


Exercise 14.7 Verify the axioms from Definition 6.1 on p. 123.


The standard basis vectors $e_1, e_2, \ldots, e_n$ of $K^n$ span $K^n$ over K[t] as well. However,
over K[t], they are linearly related. In particular, the homothety with coefficient t,
$v \mapsto tv$, has two different matrices in this system of generators: one equals $t \cdot E$,
while the other equals A. As a result, the zero map, which takes each vector to zero,
can be represented by the nonzero matrix $tE - A$. It follows from Proposition 14.2
that multiplication by $\det(tE - A) = \chi_A(t)$ annihilates $K^n$. In accordance with the
definition (14.3), multiplication by the polynomial $\chi_A(t)$ considered as a K-linear
map $K^n \to K^n$ has matrix $\chi_A(A)$ in the standard basis of $K^n$. Since $K^n$ is free over
K, each K-linear endomorphism of $K^n$ has a unique matrix in the standard basis. We
conclude that $\chi_A(A) = 0$.
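The identity $\chi_A(A) = 0$ can be checked on concrete integer matrices. The sketch below is not the argument of Example 14.3: it computes the coefficients of $\det(tE - A)$ by the Faddeev–LeVerrier recursion (a technique swapped in here for convenience) and then evaluates the polynomial at A by Horner's rule. All function names are hypothetical helpers.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(tE - A) = c_0 + c_1 t + ... + c_n t^n."""
    n = len(A)
    c = [0] * (n + 1)
    c[n] = 1
    M = [[0] * n for _ in range(n)]
    for k in range(1, n + 1):
        # Faddeev-LeVerrier step: M_k = A*M_{k-1} + c_{n-k+1}*E, c_{n-k} = -tr(A*M_k)/k
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(mat_mul(A, M)[i][i] for i in range(n))
        c[n - k] = -trace // k  # the division is exact for integer matrices
    return c

def evaluate_at_matrix(c, A):
    """Evaluate c_0 E + c_1 A + ... + c_n A^n by Horner's rule."""
    n = len(A)
    R = [[0] * n for _ in range(n)]
    for coeff in reversed(c):
        RA = mat_mul(R, A)
        R = [[RA[i][j] + (coeff if i == j else 0) for j in range(n)] for i in range(n)]
    return R

A = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
assert evaluate_at_matrix(char_poly(A), A) == [[0] * 3 for _ in range(3)]
```

For the 2 × 2 matrix with rows (1, 2) and (3, 4), `char_poly` returns `[-2, -5, 1]`, that is, $t^2 - 5t - 2$.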


⁴ See Sect. 9.6 on p. 220.
⁵ See Sect. 8.1.3 on p. 175.

14.1.5 Torsion


Everywhere in this section, we assume that K has no zero divisors. An element m of
a K-module M is called a torsion element if $\lambda m = 0$ for some nonzero $\lambda \in K$.


Exercise 14.8 Verify that the torsion elements form a submodule of M.


The submodule of all torsion elements in M is denoted by

\[ \operatorname{Tors} M \stackrel{\text{def}}{=} \{\, m \in M \mid \exists\, \lambda \ne 0 : \lambda m = 0 \,\} \]

and called the torsion submodule. A module M is called torsion-free if Tors M = 0.
For example, every ideal in K is torsion-free under our assumption that K has no
zero divisors. Of course, every free K-module is torsion-free.


Exercise 14.9 Show that if N is torsion-free, then every K-linear map $f : M \to N$
annihilates Tors(M).


If Tors M = M, then M is called a torsion module. For example, the quotient ring
K/I by a nonzero ideal $I \subset K$ is a torsion K-module, because $\lambda [x]_I = [\lambda x]_I = 0$ for
every nonzero $\lambda \in I$.


14.1.6 Quotient of a Module by an Ideal 


For an ideal $I \subset K$ and K-module M, we write $IM \subset M$ for the submodule formed
by all finite linear combinations of vectors in M with coefficients in I:

\[ IM \stackrel{\text{def}}{=} \{\, x_1 a_1 + x_2 a_2 + \cdots + x_n a_n \mid x_i \in I,\ a_i \in M,\ n \in \mathbb{N} \,\}. \]


Exercise 14.10 Verify that IM is actually a K-submodule in M.


The quotient module M/IM has the structure of a module over the quotient ring K/I
defined by

\[ [\lambda]_I \cdot [w]_{IM} \stackrel{\text{def}}{=} [\lambda w]_{IM}, \]

where $[\lambda]_I = \lambda \pmod{I}$ and $[w]_{IM} = w \pmod{IM}$.


Exercise 14.11 Verify the consistency of this multiplication rule. 


14.1.7 Direct Sum Decompositions 


A direct sum (respectively product) of modules is defined as the direct sum (respectively
product) of the underlying abelian groups equipped with componentwise
multiplication by scalars exactly as was done for vector spaces in Sect. 6.4.5 on
p. 141.


Exercise 14.12 Show that the direct sum of free modules with bases E and F is a
free module with basis $E \sqcup F$.


Associated with any collection of submodules $N_1, N_2, \ldots, N_s \subset M$ is the K-linear
addition map

\[ N_1 \oplus N_2 \oplus \cdots \oplus N_s \to M, \quad (u_1, u_2, \ldots, u_s) \mapsto u_1 + u_2 + \cdots + u_s. \tag{14.4} \]

If it is bijective, then M is said to be the direct sum of the submodules $N_i$. We write
$M = \bigoplus_i N_i$ in this case. Bijectivity of the addition map (14.4) means that every
vector $w \in M$ has a unique decomposition $w = u_1 + u_2 + \cdots + u_s$ with $u_i \in N_i$.
For example, a free K-module with basis $e_1, e_2, \ldots, e_n$ is the direct sum of n free
modules $K \cdot e_i$ with bases $e_i$:

\[ K^n \simeq K e_1 \oplus K e_2 \oplus \cdots \oplus K e_n. \]

Exercise 14.13 Let $M = L \oplus N$ for submodules $L, N \subset M$. Show that $M/N \simeq L$.


Lemma 14.2 Given two submodules $L, N \subset M$, then $M = L \oplus N$ if and only if
$L \cup N$ spans M and $L \cap N = 0$.


Proof $L \cup N$ spans M if and only if the addition map

\[ \sigma : L \oplus N \to M, \quad (a, b) \mapsto a + b, \]

is surjective. The kernel of the addition map is zero if and only if $L \cap N = 0$, because
$(a, b) \in \ker \sigma$ if and only if $a = -b \in L \cap N$. □


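Exercise 14.14 Show that for every direct sum decomposition $M = M_1 \oplus M_2 \oplus
\cdots \oplus M_m$ and ideal $I \subset K$, we have $IM = IM_1 \oplus IM_2 \oplus \cdots \oplus IM_m$ and
$M/IM = (M_1/IM_1) \oplus (M_2/IM_2) \oplus \cdots \oplus (M_m/IM_m)$.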


Theorem 14.1 Let M be a free module over a commutative ring K with unit. Then 
all bases in M have the same cardinality. 


Proof Choose a maximal ideal⁶ $\mathfrak{m} \subset K$ and consider the quotient module $M/\mathfrak{m}M$
as a vector space over the field $k = K/\mathfrak{m}$. If a set $E \subset M$ is a basis of M over K,
then

\[ M = \bigoplus_{e \in E} K \cdot e, \quad \mathfrak{m}M = \bigoplus_{e \in E} \mathfrak{m} \cdot e, \quad\text{and}\quad M/\mathfrak{m}M = \bigoplus_{e \in E} k \cdot e. \]

Hence, E is a basis of the vector space $M/\mathfrak{m}M$ over k. Since the vector space
$M/\mathfrak{m}M$ does not depend on E, and by Theorem 6.1 on p. 132, all bases of $M/\mathfrak{m}M$
over k have the same cardinality, we conclude that the cardinality of the basis in M
does not depend on the choice of this basis. □


⁶ See Sect. 5.2.2 on p. 107.


Definition 14.1 (Rank of a Free Module) The cardinality of a basis of a free 
module M is called the rank of M and is denoted by rk M. 


Remark 14.1 It follows from the proof of Theorem 14.1 that $\dim_k M/\mathfrak{m}M = \operatorname{rk} M$.
Since the right-hand side does not depend on $\mathfrak{m} \subset K$ and is well defined by
Theorem 14.1, $\dim_k (M/\mathfrak{m}M)$ is the same for all maximal ideals $\mathfrak{m} \subset K$.


Definition 14.2 (Decomposability) A module M is called decomposable if there
exist proper submodules $L, N \subset M$ such that $M = L \oplus N$. Otherwise, M is called
indecomposable.


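Example 14.4 Let us show that $\mathbb{Z}$ is an indecomposable $\mathbb{Z}$-module. The proper
submodules $L \subset \mathbb{Z}$ are exhausted by the principal ideals $L = (d)$. If there is a
submodule $N \subset \mathbb{Z}$ such that $\mathbb{Z} = (d) \oplus N$ with $d \ne 0$, then $N \simeq \mathbb{Z}/(d)$ by Exercise 14.13,
so N is a nonzero torsion module. Since $\mathbb{Z}$ is torsion-free, there is no inclusion $N \subset \mathbb{Z}$.
Contradiction.


Exercise 14.15 For $M = \mathbb{Z}^2$ and the $\mathbb{Z}$-submodule $N \subset M$ generated by the vectors
(2, 1) and (1, 2), show that $N \simeq \mathbb{Z}^2$, $M/N \simeq \mathbb{Z}/(3)$, and that there exists no
$\mathbb{Z}$-submodule $L \subset M$ such that $M = L \oplus N$.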


14.1.8 Semisimplicity 


A module M is called semisimple if for every nonzero proper submodule $N \subset M$,
there exists a submodule $L \subset M$ such that $M = L \oplus N$. Every submodule L with
this property is said to be complementary to N.


Exercise 14.16 Show that every vector space V over an arbitrary field k is 
semisimple. 


If a commutative ring K is not a field, then there exist nonsemisimple modules. We
have seen above that the free $\mathbb{Z}$-modules $\mathbb{Z}$ and $\mathbb{Z}^2$ are not semisimple. An example
of a semisimple $\mathbb{Z}$-module is $M = \mathbb{Z}/(p) \oplus \mathbb{Z}/(p) \oplus \cdots \oplus \mathbb{Z}/(p)$, where $p \in \mathbb{N}$ is
prime. This module is simultaneously a vector space over the field $\mathbb{F}_p = \mathbb{Z}/(p)$, and
all its $\mathbb{Z}$-submodules are at the same time vector subspaces over $\mathbb{F}_p$, and conversely.
In particular, every vector subspace complementary to N serves as a complementary
$\mathbb{Z}$-submodule as well.


14.2 Invariant Factors


14.2.1 Submodules of Finitely Generated Free Modules 


Beginning from this point, we assume by default that the ground ring K is a principal
ideal domain⁷ and all free K-modules in question are finitely generated. A free
module of rank zero always means for us the zero module.


Lemma 14.3 Let M be a free module of rank $m < \infty$ over an arbitrary principal
ideal domain K. Then every submodule $N \subset M$ is free, and $\operatorname{rk} N \le \operatorname{rk} M$.


Proof By induction on $m = \operatorname{rk} M$. If m = 1, then $M \simeq K$, and the submodule
$N \subset M$ is a principal ideal $(d) \subset K$. If d = 0, then N = 0 is free of rank 0.
If $d \ne 0$, then $(d) = d \cdot K$ is free of rank 1 with basis⁸ d. Now consider m > 1.
Let us fix some basis $e_1, e_2, \ldots, e_m$ in M and write vectors $v \in M$ as the rows of
coordinates in this basis. Then the first coordinates $x_1(v)$ of all vectors $v \in N$ form
an ideal $(d) \subset K$. If d = 0, then N is contained in the free module $M' \subset M$ with
basis $e_2, \ldots, e_m$. By induction, N is free with $\operatorname{rk} N \le m - 1$. If $d \ne 0$, write
$v_1 \in N$ for some vector whose first coordinate equals d. Then $N = K \cdot v_1 \oplus N'$,
where $N' = N \cap M'$ consists of the vectors with vanishing first coordinate, because
$(K \cdot v_1) \cap N' = 0$ and every vector $v \in N$ can be decomposed as $\lambda v_1 + w$ for
$\lambda = x_1(v)/d$ and $w = v - \lambda v_1 \in N'$. By induction, N′ is free of rank at most m − 1.
The module $K v_1$ is free of rank 1 with basis $v_1$, because $\lambda v_1 \ne 0$ for $\lambda \ne 0$. Hence,
N is free of rank at most m. □


Theorem 14.2 (Invariant Factors Theorem) Let K be a principal ideal domain,
M a free K-module of rank $m < \infty$, and $N \subset M$ a submodule, automatically free of
rank $n \le m$ by Lemma 14.3. Then there exists a basis $e_1, e_2, \ldots, e_m$ in M such that
appropriate multiples $f_1 e_1, f_2 e_2, \ldots, f_n e_n$ of the first n basis vectors form a basis in
N and satisfy the divisibility conditions $f_i \mid f_j$ for all i < j. The factors $f_i$ considered
up to multiplication by invertible elements do not depend on the choice of such a
basis.


Definition 14.3 (Reciprocal Bases and Invariant Factors) Bases $e_1, e_2, \ldots, e_m$
in M and $f_1 e_1, f_2 e_2, \ldots, f_n e_n$ in N satisfying the conditions of Theorem 14.2 are
called reciprocal bases of M and $N \subset M$. The factors $f_1, f_2, \ldots, f_n \in K$ considered
up to multiplication by invertible elements are called the invariant factors of the
submodule $N \subset M$.


We split the proof of Theorem 14.2 into a few steps presented in Sects. 14.2.2- 
14.2.4 below. To begin, we give an equivalent reformulation of Theorem 14.2 in 
terms of matrices. 


⁷ See Sect. 5.3 on p. 109.
⁸ Note that d is linearly independent, because K has no zero divisors: $\lambda d = 0 \Rightarrow \lambda = 0$.

Theorem 14.3 (Smith Normal Form Theorem) For every rectangular matrix $C \in
\operatorname{Mat}_{m \times k}(K)$, there exists a pair of invertible matrices $F \in \operatorname{GL}_m(K)$, $G \in \operatorname{GL}_k(K)$
such that the matrix

\[
D = FCG =
\begin{pmatrix}
f_1 &        &     &   \\
    & \ddots &     & 0 \\
    &        & f_n &   \\
    & 0      &     & 0
\end{pmatrix}
\tag{14.5}
\]

has $d_{ij} = 0$ for all $i \ne j$, and all nonzero diagonal elements $f_i = d_{ii}$ satisfy the
divisibility conditions $f_i \mid f_j$ for all i < j. The nonzero diagonal elements $f_i$ of the
matrix D considered up to multiplication by invertible elements of K do not depend
on the choice of matrices F, G satisfying (14.5) and the divisibility conditions on $f_i$.


Definition 14.4 (Smith Normal Form) The diagonal matrix D from Theorem 14.3
is called the Smith normal form of the matrix C. The nonzero diagonal elements
$f_1, f_2, \ldots, f_n$ in Smith normal form are called invariant factors of the matrix C.


14.2.2 Deduction of the Invariant Factors Theorem from 
the Smith Normal Form Theorem 


Let us fix a basis $w = (w_1, w_2, \ldots, w_m)$ of M and vectors $u = (u_1, u_2, \ldots, u_k)$
spanning N. We apply Theorem 14.3 to the transition matrix $C_{wu} \in \operatorname{Mat}_{m \times k}(K)$, whose
jth column is the column of coordinates of the vector $u_j$ in the basis w. Thus, $u = w \cdot
C_{wu}$. Let $F \in \operatorname{GL}_m(K)$, $G \in \operatorname{GL}_k(K)$ fit the conditions of Theorem 14.3 for $C = C_{wu}$,
that is, the matrix $D = F C_{wu} G$ has diagonal form (14.5) and satisfies the required
divisibility conditions. Since F is invertible, the vectors $e = w F^{-1}$ also form a
basis of M. The vectors $\varepsilon = uG$ are expressed through e as $\varepsilon = uG = w C_{wu} G =
e F C_{wu} G = eD$. Hence, only the first n vectors in $\varepsilon$ are nonzero. These nonzero
vectors $\varepsilon_i = f_i e_i$ are linearly independent, because the vectors $e_i$ are. They span
N, because the initial generators u are linearly expressed through $\varepsilon$ as $u = \varepsilon G^{-1}$.
Therefore, $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ is a basis of N. This proves the existence of reciprocal
bases. If there are two bases $e' = (e'_1, e'_2, \ldots, e'_m)$ and $e'' = (e''_1, e''_2, \ldots, e''_m)$ in M
such that some multiples $\varepsilon'_i = f'_i e'_i$ and $\varepsilon''_i = f''_i e''_i$ for $1 \le i \le n$ form bases in N and
satisfy the required divisibility conditions, then both diagonal transition matrices
$C_{e''\varepsilon''} = C_{e''e'} C_{e'\varepsilon'} C_{\varepsilon'\varepsilon''}$ and $C_{e'\varepsilon'} = E_m C_{e'\varepsilon'} E_n$, where $E_m$, $E_n$ are the identity
m × m and n × n matrices, fit the conditions of Theorem 14.3 for the same m × n matrix
$C = C_{e'\varepsilon'}$. Hence, $f'_i = f''_i$ up to multiplication by invertible elements.


14.2.3 Uniqueness of the Smith Normal Form


Write $\Delta_k(C) \in K$ for the greatest common divisor⁹ of all k × k minors in a
matrix C. For a diagonal matrix (14.5) such that $f_i \mid f_j$ for all i < j, we have
$\Delta_k(D) = f_1 f_2 \cdots f_k$. Thus, the diagonal elements $f_k = \Delta_k(D) / \Delta_{k-1}(D)$ are
recovered from the quantities $\Delta_k(D)$ uniquely up to multiplication by invertible
elements. The uniqueness of the Smith normal form of a matrix C follows now
from the next lemma.


Lemma 14.4 The numbers $\Delta_k(C) \in K$ (considered up to multiplication by
invertible elements of K) are not changed under multiplication of C by an invertible
matrix from either side.


Proof Since $\Delta_k(C) = \Delta_k(C^t)$, it is enough to consider left multiplication. Let $F =
AC$, where A is invertible. Since the rows of F are K-linear combinations of the rows
of C, every order-k minor in F is a K-linear combination of order-k minors in C.


Exercise 14.17 Check this explicitly.


This forces $\Delta_k(C)$ to divide $\Delta_k(F)$. For the same reason, the equality $C = A^{-1} F$
forces $\Delta_k(F)$ to divide $\Delta_k(C)$. Therefore, $\Delta_k(C)$ and $\Delta_k(F)$ coincide up to an
invertible factor.¹⁰ □
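For integer matrices the quantities $\Delta_k$ can be computed directly from the definition. The following is a minimal sketch, assuming $K = \mathbb{Z}$ and small matrices (the determinant uses naive Laplace expansion); the names `det` and `delta` are hypothetical helpers. It is applied to the 3 × 4 matrix (14.7) treated in Example 14.5 below, and it also checks the invariance asserted in Lemma 14.4 for one unimodular left factor chosen for illustration.

```python
from itertools import combinations
from math import gcd

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def delta(C, k):
    """Nonnegative GCD of all k x k minors of the integer matrix C; delta(C, 0) == 1."""
    g = 0
    for rows in combinations(range(len(C)), k):
        for cols in combinations(range(len(C[0])), k):
            g = gcd(g, det([[C[i][j] for j in cols] for i in rows]))
    return g

C = [[126, 51, 72, 33], [30, 15, 18, 9], [60, 30, 36, 18]]
deltas = [delta(C, k) for k in range(4)]       # Delta_0, ..., Delta_3
assert deltas == [1, 3, 18, 0]

# Left multiplication by a matrix of determinant 1 leaves every Delta_k unchanged.
A = [[1, 2, 0], [0, 1, 0], [3, 0, 1]]
AC = [[sum(A[i][t] * C[t][j] for t in range(3)) for j in range(4)] for i in range(3)]
assert [delta(AC, k) for k in range(4)] == deltas
```

The quotients 3/1 = 3 and 18/3 = 6 are the invariant factors of this matrix, in agreement with the Smith normal form found in Example 14.5.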


14.2.4 Gaussian Elimination over a Principal Ideal Domain 


We will construct matrices F, G satisfying Theorem 14.3 as products

\[ F = F_r F_{r-1} \cdots F_1, \quad G = G_1 G_2 \cdots G_s \]

of some elementary invertible matrices $F_\nu$ (respectively $G_\mu$) such that the transformation
$A \mapsto F_\nu A$ (respectively $A \mapsto A G_\mu$) changes just two rows (respectively
columns) $a_i$, $a_j$ in A and leaves all the other rows (respectively columns) of A fixed.
Rows (respectively columns) $a_i$, $a_j$ will be replaced by their linear combinations
$a'_i = \alpha a_i + \beta a_j$, $a'_j = \gamma a_i + \delta a_j$. Such a replacement is invertible if
$\alpha\delta - \beta\gamma = 1$ (more generally, if and only if $\alpha\delta - \beta\gamma$ is invertible in K). In that
case, the elementary matrix $F_\nu$ (respectively $G_\mu$) is invertible
too. We call such invertible transformations of pairs of rows (respectively columns)
generalized Gaussian operations.¹¹


⁹ Recall that the greatest common divisor of elements in a principal ideal domain is a generator of
the ideal spanned by those elements. It is unique up to multiplication by invertible elements of K
(see Sect. 5.3.2 on p. 110).
¹⁰ Compare with Exercise 5.17 on p. 111.
¹¹ Compare with Sect. 8.4 on p. 182.

Lemma 14.5 For every pair of elements (p, q) in the same row (respectively
column) of a matrix A and such that $p \nmid q$ and $q \nmid p$, there exists a generalized
Gaussian column operation (respectively row operation) that transforms (p, q) to
(d, 0), where d = GCD(p, q).


Proof Write d = GCD(p, q) as $d = px + qy$. Then $p = ad$, $q = bd$ for some
$a, b \in K$, and we have the equalities $bp = aq$ and $ax + by = 1$, which imply that

\[
(p, q) \begin{pmatrix} x & -b \\ y & a \end{pmatrix} = (d, 0), \qquad
\begin{pmatrix} x & y \\ -b & a \end{pmatrix} \begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} d \\ 0 \end{pmatrix},
\]

and

\[
\det \begin{pmatrix} x & -b \\ y & a \end{pmatrix} = \det \begin{pmatrix} x & y \\ -b & a \end{pmatrix} = ax + by = 1.
\]

□

To finish the proof of Theorem 14.3, it remains to establish the next proposition.
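Over $K = \mathbb{Z}$ the matrix from the proof of Lemma 14.5 can be produced by the extended Euclidean algorithm. The sketch below assumes positive integers p, q; the names `ext_gcd` and `column_operation` are hypothetical helpers introduced for this illustration.

```python
def ext_gcd(p, q):
    """Return (d, x, y) with d = gcd(p, q) = p*x + q*y, for positive integers p, q."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while q:
        t = p // q
        p, q = q, p - t * q
        x0, x1 = x1, x0 - t * x1
        y0, y1 = y1, y0 - t * y1
    return p, x0, y0

def column_operation(p, q):
    """Return d and the 2 x 2 matrix of determinant 1 sending the row (p, q) to (d, 0)."""
    d, x, y = ext_gcd(p, q)
    a, b = p // d, q // d            # p = a*d, q = b*d, hence a*x + b*y = 1
    return d, [[x, -b], [y, a]]

p, q = 9, 6
d, M = column_operation(p, q)
row = (p * M[0][0] + q * M[1][0], p * M[0][1] + q * M[1][1])
assert row == (d, 0) == (3, 0)
assert M[0][0] * M[1][1] - M[0][1] * M[1][0] == 1
```

Transposing the matrix gives the corresponding row operation for a pair (p, q) standing in one column.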


Proposition 14.3 Every m × n matrix C over an arbitrary principal ideal domain K
can be transformed to Smith normal form (14.5) by means of generalized Gaussian
operations on rows and columns.


Proof After an appropriate permutation of rows and columns, we may assume that
$c_{11} \ne 0$. If all elements of C are divisible by $c_{11}$, then the usual elementary row
and column operations allow us to eliminate the first column and the first row of
C outside the upper left cell. Induction on the size of C allows us to transform the
remaining (m − 1) × (n − 1) submatrix to Smith form. During this transformation,
all matrix elements remain divisible by $c_{11}$.

Now assume that C contains an element $a \notin (c_{11})$. Let $d = \operatorname{GCD}(a, c_{11})$.
We are going to transform C into a matrix C′ such that $c'_{11} = d$. Note that this
transformation strictly enlarges the ideal generated by the upper left cell, because
$(c_{11}) \subsetneq (a, c_{11}) = (c'_{11})$. If a is in the first row or in the first column, it is enough to
replace the pair $(c_{11}, a)$ by (d, 0) via Lemma 14.5. If all elements in the first row and
the first column of C are divisible by $c_{11}$ and a is strictly to the right and below $c_{11}$,
then at the first step, we eliminate the first column and the first row of C outside the
upper left cell by means of the usual elementary row and column operations. This
changes a by its sum with some multiple of $c_{11}$, so the resulting element remains
indivisible by $c_{11}$. We continue to write a for this element. At the second step, we
add the row containing a to the first row and get a pair $(c_{11}, a)$ in the first row.
Finally, we replace this pair by (d, 0) via Lemma 14.5.

Since K is Noetherian,¹² the ideal generated by the upper left cell of C cannot
be strictly enlarged infinitely many times. Therefore, after a few transformations as
described above, we come to some matrix C all of whose elements are divisible by
$c_{11}$. This case was already considered at the beginning of the proof. □


¹² See Definition 5.1 on p. 104.
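For $K = \mathbb{Z}$ the reduction can be carried out mechanically. The sketch below is a variant rather than a transcription of the proof above: instead of the operations of Lemma 14.5 it repeatedly moves an entry of smallest absolute value to the corner and uses division with remainder, which is available in $\mathbb{Z}$. It returns only the diagonal matrix and does not track the transforming matrices F and G. The function name is a hypothetical choice.

```python
def smith_normal_form(C):
    """Smith normal form of an integer matrix given as a list of rows (diagonal part only)."""
    A = [list(row) for row in C]
    m, n = len(A), len(A[0])
    for t in range(min(m, n)):
        while True:
            nonzero = [(abs(A[i][j]), i, j)
                       for i in range(t, m) for j in range(t, n) if A[i][j]]
            if not nonzero:
                return A                      # the remaining block is zero
            _, i, j = min(nonzero)            # entry of smallest absolute value
            A[t], A[i] = A[i], A[t]           # move it to the cell (t, t)
            for row in A:
                row[t], row[j] = row[j], row[t]
            p = A[t][t]
            for i in range(t + 1, m):         # reduce column t modulo p
                q = A[i][t] // p
                A[i] = [x - q * y for x, y in zip(A[i], A[t])]
            for j in range(t + 1, n):         # reduce row t modulo p
                q = A[t][j] // p
                for row in A:
                    row[j] -= q * row[t]
            if any(A[i][t] for i in range(t + 1, m)) or any(A[t][t + 1:]):
                continue                      # nonzero remainders: a smaller pivot exists
            bad = [i for i in range(t + 1, m) if any(x % p for x in A[i][t + 1:])]
            if not bad:
                break                         # p divides every remaining entry
            # Add a row containing an entry not divisible by p to row t, as in the proof.
            A[t] = [x + y for x, y in zip(A[t], A[bad[0]])]
        A[t][t] = abs(A[t][t])                # normalize the sign by a unit
    return A

C = [[126, 51, 72, 33], [30, 15, 18, 9], [60, 30, 36, 18]]
assert smith_normal_form(C) == [[3, 0, 0, 0], [0, 6, 0, 0], [0, 0, 0, 0]]
```

The loop terminates because the absolute value of the pivot never increases and strictly decreases at least every second pass, which plays the role of the Noetherian argument in the proof.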


Exercise 14.18 Form a Γ-shaped table $\begin{array}{c|c} C & E \\ \hline E & \end{array}$ by attaching the m × m and n × n
identity matrices on the right and at the bottom of C. Then transform C to Smith
normal form by applying the generalized row and column operations to the whole
Γ-shaped table. Write $\begin{array}{c|c} D & F \\ \hline G & \end{array}$ for the resulting table. Show that the matrices D, F, G
satisfy D = FCG as required in Theorem 14.3.


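Example 14.5 (Abelian Subgroups in $\mathbb{Z}^m$) By Lemma 14.3, every abelian subgroup
$L \subset \mathbb{Z}^m$ is a free $\mathbb{Z}$-module. Let $\operatorname{rk} L = \ell$. By Theorem 14.2, there exists a basis
$u_1, u_2, \ldots, u_m$ in $\mathbb{Z}^m$ such that some multiples $m_1 u_1, m_2 u_2, \ldots, m_\ell u_\ell$ of the first $\ell$
basis vectors form a basis in L. This means, in particular, that for the quotient $\mathbb{Z}^m / L$,
we have

\[ \mathbb{Z}^m / L \simeq \frac{\mathbb{Z}}{(m_1)} \oplus \cdots \oplus \frac{\mathbb{Z}}{(m_\ell)} \oplus \mathbb{Z}^{m - \ell}. \tag{14.6} \]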


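Let us carry out this analysis in detail for the subgroup $L \subset \mathbb{Z}^3$ spanned by the
columns of the matrix

\[
C = \begin{pmatrix}
126 & 51 & 72 & 33 \\
30 & 15 & 18 & 9 \\
60 & 30 & 36 & 18
\end{pmatrix}. \tag{14.7}
\]

To find reciprocal bases, we have to transform this matrix to Smith normal form. The
GCD of all matrix elements equals 3. We can get −3 at the (1, 4) cell by subtracting
the second row multiplied by 4 from the first row. With Exercise 14.18 in mind, we
perform this operation in the Γ-shaped table and get

\[
\begin{array}{rrrr|rrr}
6 & -9 & 0 & -3 & 1 & -4 & 0 \\
30 & 15 & 18 & 9 & 0 & 1 & 0 \\
60 & 30 & 36 & 18 & 0 & 0 & 1 \\
\hline
1 & 0 & 0 & 0 & & & \\
0 & 1 & 0 & 0 & & & \\
0 & 0 & 1 & 0 & & & \\
0 & 0 & 0 & 1 & & &
\end{array}
\]

Then we change the sign in the first row and swap the first and the fourth columns:

\[
\begin{array}{rrrr|rrr}
3 & 9 & 0 & -6 & -1 & 4 & 0 \\
9 & 15 & 18 & 30 & 0 & 1 & 0 \\
18 & 30 & 36 & 60 & 0 & 0 & 1 \\
\hline
0 & 0 & 0 & 1 & & & \\
0 & 1 & 0 & 0 & & & \\
0 & 0 & 1 & 0 & & & \\
1 & 0 & 0 & 0 & & &
\end{array}
\]

Now we can eliminate the first column and the first row of C outside the upper left-hand
cell by subtracting appropriate multiples of the first row from the other rows
and then doing the same with the columns:

\[
\begin{array}{rrrr|rrr}
3 & 0 & 0 & 0 & -1 & 4 & 0 \\
0 & -12 & 18 & 48 & 3 & -11 & 0 \\
0 & -24 & 36 & 96 & 6 & -24 & 1 \\
\hline
0 & 0 & 0 & 1 & & & \\
0 & 1 & 0 & 0 & & & \\
0 & 0 & 1 & 0 & & & \\
1 & -3 & 0 & 2 & & &
\end{array}
\]

Now eliminate the third row of C by subtracting the doubled second row and add
the third column to the second in order to extract the GCD of the second row:

\[
\begin{array}{rrrr|rrr}
3 & 0 & 0 & 0 & -1 & 4 & 0 \\
0 & 6 & 18 & 48 & 3 & -11 & 0 \\
0 & 0 & 0 & 0 & 0 & -2 & 1 \\
\hline
0 & 0 & 0 & 1 & & & \\
0 & 1 & 0 & 0 & & & \\
0 & 1 & 1 & 0 & & & \\
1 & -3 & 0 & 2 & & &
\end{array}
\]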

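It remains to eliminate the third and the fourth columns in C by adding appropriate
multiples of the second column:

\[
\begin{array}{rrrr|rrr}
3 & 0 & 0 & 0 & -1 & 4 & 0 \\
0 & 6 & 0 & 0 & 3 & -11 & 0 \\
0 & 0 & 0 & 0 & 0 & -2 & 1 \\
\hline
0 & 0 & 0 & 1 & & & \\
0 & 1 & -3 & -8 & & & \\
0 & 1 & -2 & -8 & & & \\
1 & -3 & 9 & 26 & & &
\end{array}
\]

Thus, the Smith normal form D of the matrix C from (14.7) is

\[
D = \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 6 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
= \begin{pmatrix} -1 & 4 & 0 \\ 3 & -11 & 0 \\ 0 & -2 & 1 \end{pmatrix}
\cdot C \cdot
\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & -3 & -8 \\ 0 & 1 & -2 & -8 \\ 1 & -3 & 9 & 26 \end{pmatrix}.
\]

This means that $L \simeq \mathbb{Z}^2$ and $\mathbb{Z}^3 / L \simeq \mathbb{Z}/(3) \oplus \mathbb{Z}/(6) \oplus \mathbb{Z}$. A basis of L is formed
by the vectors

\[ 3 v_1 = c_4 \quad\text{and}\quad 6 v_2 = c_2 + c_3 - 3 c_4, \tag{14.8} \]

where $c_j$ means the jth column of the matrix C and $v_j$ means the jth column of the
matrix

\[
F^{-1} = \begin{pmatrix} -1 & 4 & 0 \\ 3 & -11 & 0 \\ 0 & -2 & 1 \end{pmatrix}^{-1}
= \begin{pmatrix} 11 & 4 & 0 \\ 3 & 1 & 0 \\ 6 & 2 & 1 \end{pmatrix},
\]

whose columns form a basis of $\mathbb{Z}^3$ reciprocal to the sublattice L. The right-hand sides
of formulas (14.8) are the first two columns of the matrix CG, i.e., the reciprocal
basis vectors of L.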
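The arithmetic of this example is easy to recheck by machine. In the sketch below, the matrix `G` is the one obtained by applying the column operations described above to the 4 × 4 identity matrix, and `F_inv` is the inverse of `F`; `mat_mul` is a hypothetical helper for list-of-rows matrices.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

C = [[126, 51, 72, 33], [30, 15, 18, 9], [60, 30, 36, 18]]
F = [[-1, 4, 0], [3, -11, 0], [0, -2, 1]]                           # accumulated row operations
G = [[0, 0, 0, 1], [0, 1, -3, -8], [0, 1, -2, -8], [1, -3, 9, 26]]  # accumulated column operations
F_inv = [[11, 4, 0], [3, 1, 0], [6, 2, 1]]

D = mat_mul(mat_mul(F, C), G)
assert D == [[3, 0, 0, 0], [0, 6, 0, 0], [0, 0, 0, 0]]
assert mat_mul(F, F_inv) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# The first two columns of CG are 3*v_1 and 6*v_2, where v_j is the j-th column of F_inv.
CG = mat_mul(C, G)
assert [row[0] for row in CG] == [3 * row[0] for row in F_inv]
assert [row[1] for row in CG] == [6 * row[1] for row in F_inv]
```

The last two assertions restate formulas (14.8): the columns (33, 9, 18) and (24, 6, 12) of CG are the reciprocal basis vectors of L.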


Exercise 14.19 Show that the following properties of a sublattice $L \subset \mathbb{Z}^m \subset \mathbb{Q}^m$
spanned by the columns of a matrix $C \in \operatorname{Mat}_{m \times n}(\mathbb{Z})$ are equivalent: (a) $\operatorname{rk} L = m$,
(b) $\mathbb{Z}^m / L$ is finite, (c) L spans $\mathbb{Q}^m$ over $\mathbb{Q}$, (d) C as a matrix over $\mathbb{Q}$ has rank m.


Example 14.6 (Commensurable Lattices) An abelian subgroup $L \subset \mathbb{Z}^m$ satisfying
the equivalent conditions from Exercise 14.19 is called commensurable with $\mathbb{Z}^m$. If
L is given as the $\mathbb{Z}$-linear span of columns of some integer rectangular matrix C,
then L is commensurable with $\mathbb{Z}^m$ if and only if $\operatorname{rk} C = m$. Note that this can be
checked by the standard Gaussian elimination over $\mathbb{Q}$ as in Sect. 8.4 on p. 182.


Proposition 14.4 Let an abelian subgroup $L \subset \mathbb{Z}^n$ be spanned by the columns of
a square matrix $C \in \operatorname{Mat}_n(\mathbb{Z})$. Then L is commensurable with $\mathbb{Z}^n$ if and only if
$\det C \ne 0$. In this case, $|\mathbb{Z}^n / L| = |\det C|$, i.e., the cardinality of the quotient $\mathbb{Z}^n / L$
is equal to the Euclidean volume of the parallelepiped spanned by any basis of L.


Proof Choose a basis $u_1, u_2, \ldots, u_n$ in $\mathbb{Z}^n$ such that some multiples

\[ m_1 u_1, \; m_2 u_2, \; \ldots, \; m_\ell u_\ell \]

form a basis in L. We have seen in Sect. 14.2.2 that the n × n diagonal matrix D
with $d_{ii} = m_i$ for $1 \le i \le \ell$ satisfies the matrix equality FCG = D, where $F, G \in
\operatorname{Mat}_n(\mathbb{Z})$ are invertible. This means that $|\det F| = |\det G| = 1$. Thus $|\det C| =
|\det D|$ vanishes if and only if $\ell < n$. If $\ell = n$, then $|\det C| = m_1 \cdot m_2 \cdots m_n$ equals
the cardinality of the quotient $\mathbb{Z}^n / L = \bigoplus_i \mathbb{Z}/(m_i)$. □
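Proposition 14.4 can be tested by brute force for 2 × 2 matrices. The sketch below counts the cosets of $L = C \cdot \mathbb{Z}^2$ directly, using the fact that a vector v lies in L exactly when all coordinates of $C^{\vee} v$ are divisible by det C, where $C^{\vee}$ is the adjunct matrix. The function name `lattice_index` is a hypothetical helper, and the restriction to n = 2 is only for brevity.

```python
def lattice_index(C):
    """Number of cosets of L = C*Z^2 in Z^2 for a 2 x 2 integer matrix with det C != 0."""
    (a, b), (c, d) = C
    n = abs(a * d - b * c)
    # adj(C) = [[d, -b], [-c, a]] satisfies adj(C)*C = det(C)*E, so v -> adj(C)*v mod n
    # is a homomorphism Z^2 -> (Z/n)^2 with kernel L; its image counts the cosets.
    # Since n*Z^2 lies in L, the box [0, n)^2 meets every coset.
    classes = {((d * x - b * y) % n, (-c * x + a * y) % n)
               for x in range(n) for y in range(n)}
    return len(classes)

assert lattice_index([[2, 1], [1, 2]]) == 3     # the sublattice from Exercise 14.15
assert lattice_index([[3, 0], [0, 6]]) == 18
```

In both cases the count agrees with |det C|, as the proposition predicts.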


14.3 Elementary Divisors 


14.3.1 Elementary Divisors Versus Invariant Factors 


Associated with every sequence of elements $f_1, f_2, \ldots, f_n \in K$ such that $f_i \mid f_j$ for all
i < j is the disjoint union of all positive integer powers of irreducible elements $p^{\mu}$
appearing in the prime factorizations $f_i = p_{i1}^{\mu_{i1}} p_{i2}^{\mu_{i2}} \cdots p_{ik_i}^{\mu_{ik_i}}$, where $1 \le i \le n$, all
$p_{ij}$ are considered up to multiplication by invertible elements of K, and $(p_{ij}) \ne (p_{ik})$
for each i and $j \ne k$. The unordered disjoint union of all these $p_{ij}^{\mu_{ij}}$ is called the
collection of elementary divisors¹³ of the sequence of invariant factors $f_1, f_2, \ldots, f_n$.
For example, the sequence of integer invariant factors 6, 6, 12, 108 produces the
following collection of elementary divisors: $2, 2, 2^2, 2^2, 3, 3, 3, 3^3$.


Lemma 14.6 The correspondence described above establishes a bijection between
the finite sequences $f_1, f_2, \ldots, f_n \in K$, where each $f_i$ is considered up to multiplication
by invertible elements of K and $f_i \mid f_j$ for all i < j, and the finite unordered
collections of (possibly repeated) positive integer powers $p^{\mu}$ of irreducible elements
$p \in K$, where two collections are considered equal if they are in one-to-one
correspondence such that the corresponding powers $p^{\mu}$ and $q^{\nu}$ have $\mu = \nu$ and
$p = sq$ for some invertible $s \in K$.


Proof We have to show that every sequence of invariant factors $f_1, f_2, \ldots, f_n$ is
uniquely recovered from the collection of its elementary divisors. For every prime
$p \in K$, write m(p) and n(p), respectively, for the maximal exponent of p and for
the total number of (possibly repeated) powers of p represented in the collection
of divisors. Now let us place the divisors in the cells of an appropriate Young
diagram as follows. Choose a prime $p_1$ with maximal $n(p_1)$ and write in a row all
its powers represented in the collection from left to right in nonincreasing order
of their exponents. This is the top row of the Young diagram we are constructing.
Then choose the next prime $p_2$ with the maximal $n(p_2)$ among the remaining primes
and write its powers in nonincreasing order of exponents in the second row of
the diagram, etc. Since the last invariant factor $f_n$ is divisible by all the previous
factors, its prime decomposition consists of the maximal powers of all the primes
represented in the collection of elementary divisors, i.e., $f_n = \prod_p p^{m(p)}$ equals the
product of elementary divisors in the first column of our Young diagram. Proceeding
by induction, we conclude that the products of elementary divisors taken along the
columns of the Young diagram form a sequence of invariant factors read from right
to left. □


¹³ Note that the same power $p^{\mu}$ can appear several times in the collection of elementary divisors if
it appears in the prime factorizations of several of the $f_i$.


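Example 14.7 The following collection of integer elementary divisors,

$$\begin{array}{lllll}
3^2 & 3^2 & 3 & 3 & 3 \\
2^3 & 2^3 & 2^2 & 2 & \\
7^2 & 7 & 7 & & \\
5 & 5 & & &
\end{array}$$

appears from the following sequence of invariant factors:

$$f_1 = 3,\quad f_2 = 3\cdot 2,\quad f_3 = 3\cdot 2^2\cdot 7,\quad f_4 = 3^2\cdot 2^3\cdot 7\cdot 5,\quad f_5 = 3^2\cdot 2^3\cdot 7^2\cdot 5.$$

Similarly, looking at the elementary divisors considered before Lemma 14.6,

$$\begin{array}{llll}
3^3 & 3 & 3 & 3 \\
2^2 & 2^2 & 2 & 2
\end{array}$$

we recover the initial sequence $6 = 3\cdot 2$, $6 = 3\cdot 2$, $12 = 3\cdot 2^2$, $108 = 3^3\cdot 2^2$ of
invariant factors.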
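The column-product recipe from the proof of Lemma 14.6 is easy to mechanize for integers. The sketch below is an illustration only: elementary divisors are encoded as hypothetical `(prime, exponent)` pairs, each prime gets one row of exponents sorted in nonincreasing order, and the products along the columns are read from right to left. The order of the rows does not affect the column products, so the rows are not sorted by length here.

```python
from collections import defaultdict

def invariant_factors(elementary_divisors):
    """Recover f_1 | f_2 | ... | f_n from a list of (prime, exponent) pairs."""
    rows = defaultdict(list)
    for p, m in elementary_divisors:
        rows[p].append(m)
    for exponents in rows.values():
        exponents.sort(reverse=True)        # each row: nonincreasing exponents
    width = max((len(e) for e in rows.values()), default=0)
    columns = []
    for j in range(width):                  # product along the j-th column
        column_product = 1
        for p, exponents in rows.items():
            if j < len(exponents):
                column_product *= p ** exponents[j]
        columns.append(column_product)
    return columns[::-1]                    # read the columns from right to left

small = [(2, 1), (2, 1), (2, 2), (2, 2), (3, 1), (3, 1), (3, 1), (3, 3)]
print(invariant_factors(small))
```

For the collection considered before Lemma 14.6 this prints the sequence 6, 6, 12, 108.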


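Theorem 14.4 (Elementary Divisors Theorem) Every finitely generated module
$M$ over an arbitrary principal ideal domain $K$ is isomorphic to

$$K^{n_0} \oplus \frac{K}{(p_1^{n_1})} \oplus \cdots \oplus \frac{K}{(p_\alpha^{n_\alpha})}, \qquad (14.9)$$

where the $p_\nu \in K$ are irreducible, $n_\nu \in \mathbb{N}$, and repeated summands are allowed.
Two such modules

$$K^{n_0} \oplus \frac{K}{(p_1^{n_1})} \oplus \cdots \oplus \frac{K}{(p_\alpha^{n_\alpha})} \quad\text{and}\quad K^{m_0} \oplus \frac{K}{(q_1^{m_1})} \oplus \cdots \oplus \frac{K}{(q_\beta^{m_\beta})}$$

are isomorphic if and only if $n_0 = m_0$, $\alpha = \beta$, and after appropriate renumbering
of the summands we get $n_\nu = m_\nu$ and $p_\nu = s_\nu q_\nu$ for some invertible $s_\nu \in K$.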

Definition 14.5 (Elementary Divisors) The decomposition (14.9) for a given
finitely generated module $M$ is called the canonical decomposition. The collection
of (possibly repeated) powers $p_\nu^{n_\nu}$ appearing in the canonical decomposition is called
the collection of elementary divisors of the $K$-module $M$. By Theorem 14.4, two
$K$-modules are isomorphic if and only if they have equal collections of elementary
divisors and equal free parts $K^{n_0}$.


We split the proof of Theorem 14.4 into several steps presented in Sects. 14.3.2–14.3.5
below.

14.3.2 Existence of the Canonical Decomposition


Let the vectors $w_1, w_2, \ldots, w_m$ span $M$. Then $M = K^m/R$, where $R \subset K^m$ is the
kernel of the $K$-linear surjection $\pi_w : K^m \twoheadrightarrow M$, which sends the standard basis
vector $e_i \in K^m$ to $w_i \in M$. By the invariant factors theorem, Theorem 14.2, there
exists a basis $u_1, u_2, \ldots, u_m$ in $K^m$ such that some multiples $f_1u_1, f_2u_2, \ldots, f_ku_k$ of
the first $k = \mathrm{rk}\, R$ vectors form a basis in $R$. Hence,

$$M = K^m/R = K/(f_1) \oplus \cdots \oplus K/(f_k) \oplus K^{m-k}.$$

For each $f \in \{f_1, f_2, \ldots, f_k\}$, consider the prime factorization $f = p_1^{m_1} p_2^{m_2} \cdots p_s^{m_s}$,
where all $p_i$ are mutually coprime. By the Chinese remainder theorem,¹⁴

$$K/(f) = K/(p_1^{m_1}) \oplus K/(p_2^{m_2}) \oplus \cdots \oplus K/(p_s^{m_s}).$$

This proves the existence of the decomposition (14.9). To establish its uniqueness,
we describe all the direct summands in intrinsic terms of $M$.
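For $K = \mathbb{Z}$ the Chinese remainder step can be verified by brute force. The sketch below is a hedged illustration with a hypothetical choice $f = 180 = 2^2 \cdot 3^2 \cdot 5$: it checks that reduction modulo the prime-power factors is a bijective, additive map from $\mathbb{Z}/(180)$ onto $\mathbb{Z}/(4) \oplus \mathbb{Z}/(9) \oplus \mathbb{Z}/(5)$. The function name `crt_decomposition` is an assumption of this example.

```python
from itertools import product
from math import gcd

def crt_decomposition(moduli):
    """Send x mod (q_1 * ... * q_s) to the tuple (x mod q_1, ..., x mod q_s)."""
    # The moduli must be pairwise coprime for the map to be an isomorphism.
    assert all(gcd(a, b) == 1 for i, a in enumerate(moduli) for b in moduli[i + 1:])
    n = 1
    for q in moduli:
        n *= q
    return n, {x: tuple(x % q for q in moduli) for x in range(n)}

moduli = [4, 9, 5]                       # prime powers 2^2, 3^2, 5
n, images = crt_decomposition(moduli)

# The map hits every tuple of residues exactly once ...
is_bijective = set(images.values()) == set(product(*(range(q) for q in moduli)))
# ... and it respects addition, so it is an isomorphism of abelian groups.
is_additive = all(
    images[(x + y) % n] == tuple((a + b) % q for a, b, q in zip(images[x], images[y], moduli))
    for x in range(n) for y in range(n)
)
print(n, is_bijective, is_additive)
```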


14.3.3 Splitting Off Torsion 


The direct sum $K/(p_1^{n_1}) \oplus \cdots \oplus K/(p_\alpha^{n_\alpha})$ in the decomposition (14.9) coincides
with the torsion submodule $\mathrm{Tors}\, M = \{ w \in M \mid \exists\, \lambda \neq 0 : \lambda w = 0 \}$, and the
number $n_0$ in (14.9) equals the rank of the free module $M/\mathrm{Tors}\, M$. Therefore, they
do not depend on the particular choice of the decomposition. Note that the existence
of the decomposition (14.9) implies the freeness of $M/\mathrm{Tors}\, M$ and the existence of
a free submodule in $M$ complementary to the torsion submodule.


Corollary 14.1 Every finitely generated module over a principal ideal domain is
the direct sum of the torsion submodule and a free module. In particular, every
finitely generated torsion-free module over a principal ideal domain is free. □


14.3.4 Splitting Off p-Torsion 


For each irreducible $p \in K$, we write

$$\mathrm{Tors}_p M \stackrel{\text{def}}{=} \{ w \in M \mid \exists\, k \in \mathbb{N} : p^k w = 0 \}$$

for the submodule formed by all vectors annihilated by some power of $p$. The
submodule $\mathrm{Tors}_p M$ is called the $p$-torsion submodule of $M$. For every irreducible
$q \in K$ not associated¹⁵ with $p$, multiplication by $p^k : K/(q^m) \to K/(q^m)$, $x \mapsto p^k x$,
is invertible, because $p^k$ is invertible modulo $q^m$ for all $m, k$. Hence, multiplication
by $p^k$ has zero kernel on all summands $K/(q^m)$ with $(q) \neq (p)$. Therefore, the
$p$-torsion submodule $\mathrm{Tors}_p M$ coincides with the direct sum of all components

$$K/(p^\ell), \quad \ell \in \mathbb{N},$$

in the decomposition (14.9). We conclude that this sum does not depend on the
particular choice of the decomposition (14.9). The existence of the decomposition
(14.9) leads to the following claim.


¹⁴ See Problem 5.8 on p. 120.


Corollary 14.2 Every finitely generated torsion module over a principal ideal
domain splits into the direct sum of $p$-torsion submodules over all irreducible¹⁶
$p \in K$ such that $\mathrm{Tors}_p M \neq 0$. □


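Exercise 14.20 Write $\varphi_n : K/(p^m) \to K/(p^m)$, $x \mapsto p^n x$, for the multiplication-by-$p^n$
map. Verify that (a) $\varphi_n = 0$ for $n \ge m$, (b) $\ker \varphi_n = \operatorname{im} \varphi_{m-n} \simeq K/(p^n)$ for
$0 < n < m$, (c) $\ker \varphi_n \supset \ker \varphi_{n-1}$ and

$$\ker \varphi_n / \ker \varphi_{n-1} \simeq \begin{cases} 0 & \text{for } n > m, \\ K/(p) & \text{for } 1 \le n \le m. \end{cases}$$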


14.3.5 Invariance of p-Torsion Exponents 


To complete the proof of Theorem 14.4, it remains to verify that for each irreducible
$p \in K$ and every $K$-module $M$ of the form

$$M = \frac{K}{(p^{\nu_1})} \oplus \cdots \oplus \frac{K}{(p^{\nu_k})}, \quad\text{where } \nu_1 \ge \nu_2 \ge \cdots \ge \nu_k,$$

the Young diagram $\nu = (\nu_1, \nu_2, \ldots, \nu_k)$ is uniquely determined by $M$ and does not
depend on the particular choice of the above decomposition. Write $\varphi_i : M \to M$,
$v \mapsto p^i v$, for the multiplication-by-$p^i$ map. By Exercise 14.20, for each positive
integer $i$, the quotient module $\ker \varphi_i / \ker \varphi_{i-1}$ splits into the direct sum of several
copies of $K/(p)$. The total number of these copies is equal to the total number of
rows of length $\ge i$ in $\nu$, that is, to the height of the $i$th column of $\nu$. On the other
hand, $\ker \varphi_i / \ker \varphi_{i-1}$ admits a well-defined multiplication by the elements of $K/(p)$,
i.e., it is a vector space over the field $K/(p)$.


¹⁵ That is, such that $(q) \neq (p)$, or equivalently, $\mathrm{GCD}(q, p) = 1$ (see Exercise 5.17 on p. 111).
¹⁶ Considered up to multiplication by invertible elements.


Exercise 14.21 Verify this.


Thus, the height of the $i$th column of the diagram $\nu$ is equal to

$$\dim_{K/(p)} \left( \ker \varphi_i / \ker \varphi_{i-1} \right)$$

and therefore does not depend on the choice of decomposition. Hence the diagram
$\nu$ is uniquely determined by $M$. This completes the proof of Theorem 14.4.
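The argument above can be tried out for $K = \mathbb{Z}$ on a small finite abelian $p$-group. In the following sketch (function names are illustrative, and the brute-force enumeration is an assumption suitable only for small groups) the column heights of the diagram are computed from the sizes of the kernels of multiplication by $p^i$, and the row lengths are then recovered by transposing.

```python
from itertools import product

def kernel_size(p, exponents, i):
    """Number of elements of Z/(p^v1) x ... x Z/(p^vk) killed by p^i (brute force)."""
    moduli = [p ** v for v in exponents]
    return sum(
        all((p ** i * x) % q == 0 for x, q in zip(element, moduli))
        for element in product(*(range(q) for q in moduli))
    )

def log_p(p, n):
    """Exponent e with p**e == n, for n a power of p."""
    e = 0
    while n > 1:
        n //= p
        e += 1
    return e

def diagram_from_kernels(p, exponents):
    """Recover the row lengths nu_1 >= nu_2 >= ... from the kernel sizes alone."""
    heights, i, previous = [], 1, 1          # |ker phi_0| = 1
    while True:
        current = kernel_size(p, exponents, i)
        # Height of the i-th column = dim over Z/(p) of ker phi_i / ker phi_{i-1}.
        height = log_p(p, current // previous)
        if height == 0:
            break
        heights.append(height)
        previous, i = current, i + 1
    if not heights:
        return []
    # Transpose: the j-th row length is the number of columns of height > j.
    return [sum(1 for h in heights if h > j) for j in range(heights[0])]

print(diagram_from_kernels(2, [1, 3, 2]))    # group Z/(2) + Z/(8) + Z/(4)
```

The result depends only on the group and not on the order in which the summands were listed, which is the content of this subsection.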


14.4 Description of Finitely Generated Abelian Groups 


14.4.1 Canonical Form of a Finitely Generated Abelian Group 


For K = Z, the elementary divisor theorem, Theorem 14.4, provides a complete 
classification of finitely generated abelian groups. 


Theorem 14.5 Every finitely generated abelian group is isomorphic to a direct
product of additive groups

$$\mathbb{Z}^r \oplus \frac{\mathbb{Z}}{(p_1^{n_1})} \oplus \frac{\mathbb{Z}}{(p_2^{n_2})} \oplus \cdots \oplus \frac{\mathbb{Z}}{(p_\alpha^{n_\alpha})}, \qquad (14.10)$$

where $p_\nu, n_\nu \in \mathbb{N}$, all $p_\nu$ are prime, and repeated summands are allowed. Two such
groups

$$\mathbb{Z}^r \oplus \frac{\mathbb{Z}}{(p_1^{n_1})} \oplus \cdots \oplus \frac{\mathbb{Z}}{(p_\alpha^{n_\alpha})} \quad\text{and}\quad \mathbb{Z}^s \oplus \frac{\mathbb{Z}}{(q_1^{m_1})} \oplus \cdots \oplus \frac{\mathbb{Z}}{(q_\beta^{m_\beta})}$$

are isomorphic if and only if $r = s$, $\alpha = \beta$, and after appropriate renumbering of
summands, $n_\nu = m_\nu$, $p_\nu = q_\nu$ for all $\nu$. □

Definition 14.6 The decomposition (14.10) of an abelian group $A$ is called the
canonical form of $A$.


Exercise 14.22 For which $r$, $p_\nu$, $n_\nu$ is the additive abelian group (14.10) cyclic?

14.4.2 Abelian Groups Presented by Generators and Relations


In practice, abelian groups typically are described something like, “the abelian
group $A$ spanned by the elements $a_1, a_2, \ldots, a_n$ constrained by relations

$$\begin{aligned}
\mu_{11}a_1 + \mu_{12}a_2 + \cdots + \mu_{1n}a_n &= 0, \\
\mu_{21}a_1 + \mu_{22}a_2 + \cdots + \mu_{2n}a_n &= 0, \\
\mu_{31}a_1 + \mu_{32}a_2 + \cdots + \mu_{3n}a_n &= 0, \\
\cdots\cdots\cdots & \\
\mu_{m1}a_1 + \mu_{m2}a_2 + \cdots + \mu_{mn}a_n &= 0,
\end{aligned} \qquad (14.11)$$

where the $\mu_{ij} \in \mathbb{Z}$ are given integers.” By definition, this means that $A = \mathbb{Z}^n/R$,
where $R \subset \mathbb{Z}^n$ is spanned over $\mathbb{Z}$ by the rows $\mu_1, \mu_2, \ldots, \mu_m$ of the matrix $(\mu_{ij})$. The
canonical form (14.10) of such a group has $r = n - \mathrm{rk}\,(\mu_{ij})$, where rk means the rank
of the matrix over $\mathbb{Q}$, and the elementary divisors $p_\nu^{n_\nu}$ are associated with the invariant
factors of the submodule $R \subset \mathbb{Z}^n$, as explained at the very beginning of Sect. 14.3.

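To verify whether a given element $w = w_1a_1 + w_2a_2 + \cdots + w_na_n \in A$ is zero,
or more generally, to compute the order¹⁷ $\mathrm{ord}(w)$ in $A$, Gaussian elimination over $\mathbb{Q}$
can be used as follows. Identify $w$ with the row $(w_1, w_2, \ldots, w_n) \in \mathbb{Z}^n$ and solve the
linear system $w = x_1\mu_1 + x_2\mu_2 + \cdots + x_m\mu_m$ in $x \in \mathbb{Q}^m$. If the system is
inconsistent, then $w$ is outside the $\mathbb{Q}$-linear span of the rows of the matrix $(\mu_{ij})$.
In this case, no nonzero integer multiple $mw$ belongs to $R$, that is, $w \neq 0$ in $A$ and
$\mathrm{ord}\, w = \infty$. If the rows $\mu_i$ are linearly independent over $\mathbb{Q}$ (so that the solution
is unique) and $w = x_1\mu_1 + x_2\mu_2 + \cdots + x_m\mu_m$ for some $x_i = p_i/q_i \in \mathbb{Q}$ such that
$\mathrm{GCD}(p_i, q_i) = 1$, then $\mathrm{ord}(w) = \mathrm{LCM}(q_1, q_2, \ldots, q_m)$. In any case, $w = 0$
in $A = \mathbb{Z}^n/R$ if and only if the system admits an integer solution $x \in \mathbb{Z}^m$.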
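The procedure just described can be sketched in a few lines with exact rational arithmetic. This is an illustration under a stated assumption: the relation rows are linearly independent over $\mathbb{Q}$, so the rational solution is unique and the LCM of its denominators is the order. The function name `order_in_quotient` and the sample relations are hypothetical and are not taken from the text.

```python
from fractions import Fraction
from math import lcm

def order_in_quotient(rows, w):
    """Order of w in Z^n / R, where R is spanned by the linearly independent `rows`.

    Returns None when the order is infinite.
    """
    m, n = len(rows), len(w)
    # Augmented matrix of the system sum_i x_i * rows[i] = w (n equations, m unknowns).
    aug = [[Fraction(rows[i][j]) for i in range(m)] + [Fraction(w[j])] for j in range(n)]
    pivot_row = 0
    for col in range(m):
        pivot = next((r for r in range(pivot_row, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("rows are not linearly independent over Q")
        aug[pivot_row], aug[pivot] = aug[pivot], aug[pivot_row]
        lead = aug[pivot_row][col]
        aug[pivot_row] = [value / lead for value in aug[pivot_row]]
        for r in range(n):
            if r != pivot_row and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[pivot_row])]
        pivot_row += 1
    # The remaining equations read 0 = c; a nonzero c means the system is inconsistent.
    if any(aug[r][m] != 0 for r in range(pivot_row, n)):
        return None
    return lcm(*(aug[r][m].denominator for r in range(m)))

relations = [(2, 0), (1, 3)]                  # 2*a1 = 0 and a1 + 3*a2 = 0
print(order_in_quotient(relations, (1, 0)))   # order of a1
print(order_in_quotient(relations, (0, 1)))   # order of a2
```

Here the two relations define a group of order $|\det| = 6$; the element $a_2$ has order 6 because $a_2 = -\tfrac{1}{6}\mu_1 + \tfrac{1}{3}\mu_2$, while $a_1 = \tfrac12 \mu_1$ has order 2.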


Problems for Independent Solution to Chap. 14 


Problem 14.1 (Noetherian Modules) Check that the following properties of a
module $M$ over an arbitrary commutative ring with unit are equivalent¹⁸:


(a) Every subset $X \subset M$ contains some finite subset $F \subset X$ with the same linear
span as $X$.

(b) Every submodule $N \subset M$ is finitely generated.

(c) For every infinite chain of increasing submodules $N_1 \subset N_2 \subset N_3 \subset \cdots$ there
exists $n \in \mathbb{N}$ such that $N_\nu = N_n$ for all $\nu \ge n$.


¹⁷ Recall that in an additive abelian group, the order of an element $w$ is the minimal $n \in \mathbb{N}$ such
that $nw = 0$. If there is no such $n$, we set $\mathrm{ord}(w) = \infty$ (see Sect. 3.6.1 on p. 62).


¹⁸ Modules possessing these properties are called Noetherian (compare with Lemma 5.1 on p. 104).

Problem 14.2 Show that a module over a Noetherian ring¹⁹ is Noetherian if and
only if it is finitely generated.


Problem 14.3 Show that all submodules and quotient modules of a finitely gener- 
ated module over a Noetherian ring are finitely generated too. 


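Problem 14.4 Show that for every module $M$ over an arbitrary commutative ring
with unit and every submodule $N \subset M$ such that the quotient module $L = M/N$
is free, we have $M \simeq N \oplus L$.


Problem 14.5 Prove that $\mathrm{Hom}\left(\bigoplus_{\mu=1}^{m} M_\mu,\ \bigoplus_{\nu=1}^{n} N_\nu\right) \simeq \bigoplus_{\mu,\nu} \mathrm{Hom}(M_\mu, N_\nu)$.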


Problem 14.6 Every $\mathbb{Z}$-module generated by a single vector is called cyclic. Show
that:


(a) Every cyclic $\mathbb{Z}$-module is isomorphic to either $\mathbb{Z}$ or $\mathbb{Z}/(n)$.
(b) $\mathbb{Z}/(n) \oplus \mathbb{Z}/(m)$ is cyclic if and only if $\mathrm{GCD}(m, n) = 1$.


Problem 14.7 How many decompositions into a direct sum of two cyclic subgroups
does the abelian group $\mathbb{Z}/(5) \oplus \mathbb{Z}/(5)$ admit?


Problem 14.8 How many subgroups of order (a) 2, (b) 6, are there in the noncyclic 
abelian group of order 12? 


Problem 14.9 Show that for every abelian group and a finite set of its finite 
subgroups of mutually coprime orders, the sum of these subgroups is a direct 
sum. 


Problem 14.10 Given a vector $w = (n_1, n_2, \ldots, n_m) \in \mathbb{Z}^m$, show that a $\mathbb{Z}$-submodule
$L \subset \mathbb{Z}^m$ such that $\mathbb{Z}^m = L \oplus \mathbb{Z}\cdot w$ exists if and only if $\mathrm{GCD}(n_1, n_2, \ldots, n_m) = 1$.
Find such a complementary submodule $L \subset \mathbb{Z}^4$ for $w = (2, 3, 4, 5)$.


Problem 14.11 Is there in $\mathbb{Z}^3$ a submodule complementary to the $\mathbb{Z}$-linear span of
the vectors (a) $(1, 2, 3)$ and $(4, 5, 6)$? (b) $(1, 2, 2)$ and $(4, 4, 6)$?

Problem 14.12 Write $N \subset \mathbb{Z}[x]$ for the $\mathbb{Z}$-submodule formed by all polynomials
with even constant term. Is it true that $N$ (a) is finitely generated? (b) is free?
(c) has a complementary submodule?


Problem 14.13 Enumerate the canonical forms²⁰ of all abelian groups that are
semisimple²¹ as $\mathbb{Z}$-modules.


Problem 14.14 Show that for every finitely generated free $\mathbb{Z}$-module $L$ and every
submodule $N \subset L$ such that $L/N$ is finite, both $\mathbb{Z}$-modules $L' = \mathrm{Hom}_{\mathbb{Z}}(L, \mathbb{Z})$,
$N' = \{ \varphi \in \mathrm{Hom}_{\mathbb{Z}}(L, \mathbb{Q}) \mid \varphi(N) \subset \mathbb{Z} \}$ are free and $N'/L' \simeq L/N$.


¹⁹ See Sect. 5.1.2 on p. 104.
²⁰ See (14.10) on p. 355.
²¹ See Sect. 14.1.8 on p. 343.

Problem 14.15 Are the additive groups $\mathbb{Z}/(6) \oplus \mathbb{Z}/(36)$ and $\mathbb{Z}/(12) \oplus \mathbb{Z}/(18)$
isomorphic?

Problem 14.16 Write down the complete list of canonical forms²² of all abelian
groups of order 4, 6, 8, 12, 16, 24, 48, 36.

Problem 14.17 How many abelian groups of order 10000 are there up to isomorphism?

Problem 14.18 Is there in the additive abelian group $\mathbb{Z}/(2) \oplus \mathbb{Z}/(16)$ a subgroup
isomorphic to (a) $\mathbb{Z}/(2) \oplus \mathbb{Z}/(8)$? (b) $\mathbb{Z}/(4) \oplus \mathbb{Z}/(4)$? (c) $\mathbb{Z}/(2) \oplus \mathbb{Z}/(2) \oplus \mathbb{Z}/(2)$?

Problem 14.19 Write the following abelian groups in canonical form:


(a) $\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}/(6), \mathbb{Z}/(12))$,
(b) $\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}/(4), \mathbb{Z}/(8))$,
(c) $\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}/(2) \oplus \mathbb{Z}/(2), \mathbb{Z}/(8))$.


Problem 14.20 Show that $\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}/(m), \mathbb{Z}/(n)) \simeq \mathbb{Z}/(m, n)$.

Problem 14.21 Show that for every field $k$,


(a) $\mathrm{Hom}_{k[x]}\left(k[x]/(f),\ k[x]/(g)\right) \simeq k[x]/(f, g)$,
(b) every $k[t]$-linear map $G : k[t]/(f) \to k[t]/(f)$ is a multiplication map $[h] \mapsto [gh]$
by the residue class $[g] = G([1])$.


Problem 14.22 Write in canonical form the quotient group of $\mathbb{Z}^3$ by the subgroup
spanned by:


(a) (2,—4, 6), (6, —6, 10), (2,5, 8), (6,0, 5), 

(b) (4,5, 3), (5, 6,5), (8, 7, 9), 

(c) (—62, —8, —26), (40, 10, 16), (22, —8, 10), (20, 2, 8), 
(d) (7,2, 3), (21,8, 9), (5, —4, 3), 

(e) (—81, —6, —33), (60, 6, 24), (—3, 6, —3), (18, 6, 6). 


Problem 14.23 In the abelian group generated by elements $a_1, a_2, a_3$ constrained
by the relations (a) $a_1 + a_2 + 4a_3 = 2a_1 - a_2 + 2a_3 = 0$, (b) $2a_1 + a_2 - 50a_3 =
4a_1 + 5a_2 + 60a_3 = 0$, find the orders of the elements $a_1 + 2a_3$ and $32a_1 + 31a_3$.

Problem 14.24 Find all integer solutions of the following systems of equations:

$$\begin{cases} x_1 + 2x_2 + 3x_3 + 4x_4 = 0, \\ 4x_1 + 4x_2 + 5x_3 + 5x_4 = 0, \end{cases} \qquad \begin{cases} 2x_1 + 2x_2 + 3x_3 + 3x_4 = 0, \\ 4x_1 + 4x_2 + 5x_3 + 5x_4 = 0. \end{cases}$$

Problem 14.25 Write $a$ and $b$ for the residue classes of the unit $1 \in \mathbb{Z}$ in the additive
groups $\mathbb{Z}/(9)$ and $\mathbb{Z}/(27)$ respectively. Find the canonical form of the quotient
group of $\mathbb{Z}/(9) \oplus \mathbb{Z}/(27)$ by the cyclic subgroup spanned by $3a + 9b$.


²² See Definition 14.6 on p. 355.

Problem 14.26 Let $A$ be a finite abelian group. Show that for every $m \in \mathbb{N}$ dividing
$|A|$, there is a subgroup of order $m$ in $A$.


Problem 14.27 Let some finite abelian groups $A$, $B$ have the same number of
elements of order $m$ for all $m \in \mathbb{N}$. Prove²³ that $A \simeq B$.


Problem 14.28 Show that for all finitely generated modules $A$, $B$, $C$ over a
principal ideal domain, the isomorphism $A \oplus C \simeq B \oplus C$ implies the isomorphism
$A \simeq B$.

Problem 14.29 Show that for every integer $n \ge 3$, the multiplicative group of
invertible elements in the residue class ring $\mathbb{Z}/(2^n)$ is isomorphic to the additive
group $\mathbb{Z}/(2) \oplus \mathbb{Z}/(2^{n-2})$.

Problem 14.30 (Integer-valued Polynomials) Write

$$M \stackrel{\text{def}}{=} \{ f \in \mathbb{Q}[x] \mid \forall\, m \in \mathbb{Z}\ \ f(m) \in \mathbb{Z} \}$$

for the $\mathbb{Z}$-module of integer-valued polynomials with rational coefficients and let
$M_d \subset M$ be the submodule formed by the polynomials of degree at most $d$. Verify
that the polynomials $\gamma_0 \equiv 1$ and

$$\gamma_k = \binom{x+k}{k} = \frac{(x+1)(x+2)\cdots(x+k)}{k!}, \quad 1 \le k \le d,$$

form a basis of $M_d$ over $\mathbb{Z}$ and find the cardinality of the quotient module
$M_d/(M_d \cap \mathbb{Z}[x])$.

Show that for every $\mathbb{Z}$-linear map $F : M \to M$ commuting with the shift $f(x) \mapsto
f(x + 1)$, there exists a unique power series $f \in \mathbb{Z}[[t]]$ such that $F = f(\nabla)$, where
$\nabla : f(x) \mapsto f(x) - f(x - 1)$.

Problem 14.31 (Rank of a Matrix over a Principal Ideal Domain) Let $K$ be an
arbitrary principal ideal domain. Show that for every matrix $M \in \mathrm{Mat}_{m \times n}(K)$, the
submodules generated within $K^n$ and $K^m$ by the rows and columns of $M$ are free
of the same rank equal to the number of nonzero elements in the Smith normal
form of $M$.


Problem 14.32 (Pick’s Formula for Parallelepipeds) For a triple of integer
vectors $v_1, v_2, v_3 \in \mathbb{Z}^3 \subset \mathbb{R}^3$ linearly independent over $\mathbb{R}$, write $L \subset \mathbb{Z}^3$ and
$\Pi \subset \mathbb{R}^3$ for the $\mathbb{Z}$-submodule and the real parallelepiped spanned by these
vectors. Show that the Euclidean volume of $\Pi$ equals the cardinality of the
quotient group $\mathbb{Z}^3/L$ and that it can be computed as $v + f/2 + e/4 + 1$, where
$v$, $f$, $e$ denote the numbers of internal integer points²⁴ within $\Pi$, faces of $\Pi$, and
edges of $\Pi$ respectively. Extend both claims to higher dimensions.


Problem 14.33 Is there a matrix $Z \in \mathrm{Mat}_3(\mathbb{Z})$ such that

$$Z^2 = \begin{pmatrix} 4 & -5 & 7 \\ 1 & -4 & 9 \\ -4 & 0 & 5 \end{pmatrix}?$$


²³ Compare with Problem 12.11 on p. 304.
²⁴ That is, points with integer coordinates lying inside but not on the boundary of $\Pi$, faces of $\Pi$,
and edges of $\Pi$ respectively.


Chapter 15 
Linear Operators 


15.1 Classification of Operators 


15.1.1 Spaces with Operators 


Let $k$ be an arbitrary field, $V$ a finite-dimensional vector space over $k$, and $F : V \to V$
a linear endomorphism of $V$ over $k$. We call a pair $(F, V)$ a space with operator or
just an operator over $k$. Given two spaces with operators $(F_1, U_1)$ and $(F_2, U_2)$, a
linear map $C : U_1 \to U_2$ is called a homomorphism of spaces with operators if
$F_2 \circ C = C \circ F_1$, or equivalently, if the diagram of linear maps

$$\begin{array}{ccc}
U_1 & \xrightarrow{\ C\ } & U_2 \\
{\scriptstyle F_1}\big\downarrow & & \big\downarrow{\scriptstyle F_2} \\
U_1 & \xrightarrow{\ C\ } & U_2
\end{array}$$

is commutative.¹ If $C$ is an isomorphism of vector spaces, the operators $F_1$ and $F_2$
are called similar or isomorphic. In this case, $F_2 = CF_1C^{-1}$, and we also say that
$F_2$ is conjugate to $F_1$ by $C$.


15.1.2 Invariant Subspaces and Decomposability


Let $(F, V)$ be a space with operator. A subspace $U \subset V$ is called $F$-invariant if
$F(U) \subset U$. In this case, $(F|_U, U)$ is also a space with operator, and the inclusion
$U \hookrightarrow V$ is a homomorphism of spaces with operators. If $F$ has no invariant
subspaces except for zero and the whole space, we say that $F$ is irreducible or
simple.


¹ See Problem 7.7 on p. 168.


Exercise 15.1 Check that multiplication by a residue class $[t]$ in the quotient ring
$\mathbb{R}[t]/(t^2 + 1)$ is irreducible over $\mathbb{R}$.


An operator $F : V \to V$ is called decomposable if $V = U \oplus W$, where both
subspaces $U, W \subset V$ are nonzero and $F$-invariant. Otherwise, $F$ is called indecomposable.
For example, all irreducible operators are indecomposable. Every space
with operator clearly can be split into a direct sum of indecomposable invariant
subspaces. The next exercise shows that over any field $k$, there are indecomposable
operators of every dimension, and not all such operators are irreducible.


Exercise 15.2 Check that multiplication by a residue class $[t]$ in the quotient ring
$k[t]/(t^n)$ is indecomposable for all $n$ and is irreducible if and only if $n = 1$.


Exercise 15.3 Show that the dual operators² $F : V \to V$ and $F^* : V^* \to V^*$ are
either both decomposable or both indecomposable.


15.1.3 Space with Operator as a k[t]-Module


Every linear operator $F : V \to V$ provides $V$ with the structure of a $k[t]$-module in
which multiplication of a vector $v \in V$ by a polynomial

$$f(t) = a_0 + a_1t + \cdots + a_mt^m \in k[t]$$

is defined by $f \cdot v \stackrel{\text{def}}{=} f(F)v = a_0v + a_1Fv + a_2F^2v + \cdots + a_mF^mv$. We denote this
$k[t]$-module by $V_F$ and say that it is associated with $F$. Every $k[t]$-module structure
on $V$ is associated with a linear operator $t : V \to V$, $v \mapsto t \cdot v$, provided by the
multiplication of vectors by $t$ in this structure. Therefore, a space with operator and
a $k[t]$-module of finite dimension over $k$ are the same thing.

A homomorphism $C : V_F \to W_G$ of $k[t]$-modules associated with operators
$F : V \to V$ and $G : W \to W$ is nothing but a $k$-linear map $C : V \to W$ commuting
with the multiplication of vectors by $t$, i.e., such that $C \circ F = G \circ C$. In particular,
$k[t]$-modules $V_F$ and $W_G$ are isomorphic if and only if the operators $F$ and $G$ are
similar. A vector subspace $U \subset V$ is a $k[t]$-submodule of the $k[t]$-module $V_F$ if and
only if multiplication by $t$ takes $U$ to itself, that is, exactly when $U$ is $F$-invariant.
The decomposition of $V$ into a direct sum of $F$-invariant subspaces means the same
as the decomposition of the $k[t]$-module $V_F$ into a direct sum of $k[t]$-submodules. In
other words, the structure theory of operators is identical to the structure theory of
$k[t]$-modules finite-dimensional over $k$.
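The module structure described above, in which a polynomial $f$ acts on a vector $v$ as $f(F)v$, can be evaluated without forming any matrix powers. The sketch below is only an illustration: matrices are plain lists of rows, polynomials are coefficient lists with the constant term first, and the names `apply_operator` and `act` are assumptions of this example. It uses Horner's scheme, which needs one application of $F$ per coefficient.

```python
def apply_operator(F, v):
    """Matrix-vector product F v for a square matrix given as a list of rows."""
    return [sum(F[i][j] * v[j] for j in range(len(v))) for i in range(len(F))]

def act(coeffs, F, v):
    """Compute f . v = f(F) v for f = coeffs[0] + coeffs[1] t + ... by Horner's scheme."""
    result = [0] * len(v)
    for a in reversed(coeffs):
        result = apply_operator(F, result)               # multiply the partial result by t
        result = [r + a * x for r, x in zip(result, v)]  # add the next coefficient times v
    return result

F = [[0, -1], [1, 0]]          # rotation by a quarter turn, so F^2 = -E
v = [3, 5]
print(act([0, 1], F, v))       # t . v = F v
print(act([1, 0, 1], F, v))    # (t^2 + 1) . v
```

The second line of output is the zero vector: the polynomial $t^2 + 1$ annihilates this particular module.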


² See Sect. 7.3 on p. 164.

Theorem 15.1 Each linear operator on a finite-dimensional vector space over an
arbitrary field $k$ is similar to multiplication by $t$ in a direct sum of residue modules

$$\frac{k[t]}{(p_1^{m_1})} \oplus \cdots \oplus \frac{k[t]}{(p_k^{m_k})}, \qquad (15.1)$$

where all $p_\nu \in k[t]$ are monic irreducible polynomials, and repeated summands are
allowed. Two multiplication-by-$t$ operators acting on the direct sums

$$\frac{k[t]}{(p_1^{m_1})} \oplus \cdots \oplus \frac{k[t]}{(p_k^{m_k})} \quad\text{and}\quad \frac{k[t]}{(q_1^{n_1})} \oplus \cdots \oplus \frac{k[t]}{(q_\ell^{n_\ell})}$$

are similar if and only if $k = \ell$ and after appropriate renumbering we get $p_\nu = q_\nu$,
$m_\nu = n_\nu$ for all $\nu$. All direct summands in (15.1) are indecomposable.


Proof By the elementary divisors theorem, Theorem 14.4 on p. 352, the $k[t]$-module
$V_F$ associated with an arbitrary operator $F : V \to V$ is a direct sum of a free module
$k[t]^m$ and a torsion module of the form (15.1). Since $k[t]^m$ is infinite-dimensional as
a vector space over $k$ for $m > 0$, the free component of $V_F$ vanishes. Now the first
two statements of Theorem 15.1 follow immediately from Theorem 14.4. The last
statement is also a part of Theorem 14.4, because the decomposability of the
$k[t]$-module $k[t]/(p^m)$, where $p \in k[t]$ is a monic irreducible polynomial, contradicts
the uniqueness of the decomposition (15.1) written for $k[t]/(p^m)$ itself. □


Corollary 15.1 Each indecomposable operator is similar to multiplication by $t$
in the residue module $k[t]/(p^m)$, where $p \in k[t]$ is monic irreducible. Such an
operator is irreducible if and only if $m = 1$. Two indecomposable operators acting
on $k[t]/(p^m)$ and $k[t]/(q^n)$ are similar if and only if $p = q$ and $m = n$.


Proof The first and third statements are contained in Theorem 15.1. Let us prove
the second. Every $k[t]$-submodule in $k[t]/(p^m)$ is an ideal of this quotient ring. For
$m = 1$, the residue ring $k[t]/(p)$ is a field³ and therefore has no nontrivial ideals.⁴
For $m \ge 2$, the principal ideal spanned by the class $[p] = p \pmod{p^m}$ is nonzero and
proper. □


15.1.4 Elementary Divisors 


The unordered disjoint collection of all polynomials $p_\nu^{m_\nu}$ appearing in the
decomposition (15.1) is called the collection of elementary divisors of the operator
$F : V \to V$ and is denoted by $\mathcal{E}\ell(F)$. Note that it may contain several copies
of the same polynomial $p^m$, which occurs in $\mathcal{E}\ell(F)$ as many times as there are
summands $k[t]/(p^m)$ in the decomposition (15.1). A straightforward conclusion
from Theorem 15.1 is the following corollary.


³ See Proposition 3.8 on p. 53.
⁴ See Proposition 5.1 on p. 103.


Corollary 15.2 Two linear operators $F$ and $G$ are similar if and only if $\mathcal{E}\ell(F) =
\mathcal{E}\ell(G)$. □


Exercise 15.4 Let a space with operator $F$ split into the direct sum of $F$-invariant
subspaces $U_i$ on which $F$ is restricted to operators $F_i : U_i \to U_i$. Show that

$$\mathcal{E}\ell(F) = \bigsqcup_i \mathcal{E}\ell(F_i).$$


Practical computation of elementary divisors for a given operator $F : V \to V$ can be
done as follows. Choose a basis $v = (v_1, v_2, \ldots, v_n)$ in $V$ and write $F_v \in \mathrm{Mat}_n(k)$ for
the matrix of $F$ in this basis. We will see shortly that the collection of elementary
divisors $\mathcal{E}\ell(F)$ is associated with a sequence of invariant factors⁵ $f_1, f_2, \ldots, f_n$ of the
matrix $tE - F_v \in \mathrm{Mat}_n(k[t])$ over $k[t]$. Therefore, to get $\mathcal{E}\ell(F)$, we first compute the
invariant factors $f_i$ of the matrix $tE - F_v$ either as diagonal elements of its Smith
form⁶ or by the formula $f_k = \Delta_k(tE - F_v)/\Delta_{k-1}(tE - F_v)$, where $\Delta_k$ means the GCD
of all $k \times k$ minors, then realize the irreducible factorization $f_i = p_{i1}^{m_{i1}} p_{i2}^{m_{i2}} \cdots p_{ik_i}^{m_{ik_i}}$
of each factor, then collect disjointly all factors $p^m$ appearing in these factorizations.
The coincidence of $\mathcal{E}\ell(F)$ with the collection of elementary divisors of the matrix
$tE - F_v \in \mathrm{Mat}_n(k[t])$ follows immediately from the following description of the
$k[t]$-module $V_F$ by generators and relations. The basis vectors $v_i$, which span $V_F$ even
over $k$, certainly generate $V_F$ over $k[t]$. Therefore, we have an epimorphism of
$k[t]$-modules $\pi_v : k[t]^n \twoheadrightarrow V_F$ sending the standard basis vector $e_i \in k[t]^n$ to $v_i$. Hence,
$V_F \simeq k[t]^n/\ker \pi_v$.


Lemma 15.1 Write the elements of $k[t]^n$ as coordinate columns with elements in
$k[t]$. Then the relation submodule $\ker \pi_v \subset k[t]^n$ is spanned over $k[t]$ by the columns
of the matrix $tE - F_v$.


Proof Let $F_v = (f_{ij})$. Then the $j$th column of $tE - F_v$ is expressed through the
standard basis $e$ as $te_j - f_{1j}e_1 - f_{2j}e_2 - \cdots - f_{nj}e_n$. When we apply $\pi_v$ to this vector,
we get

$$\pi_v\Bigl(te_j - \sum_{i=1}^{n} e_i f_{ij}\Bigr) = t \cdot v_j - \sum_{i=1}^{n} v_i f_{ij} = Fv_j - \sum_{i=1}^{n} v_i f_{ij} = 0.$$

Thus, all columns of $tE - F_v$ lie in $\ker \pi_v$. Every vector $h \in k[t]^n$ can be written as

$$h = t^m h_m + t^{m-1} h_{m-1} + \cdots + t h_1 + h_0, \qquad (15.2)$$

where the coefficients $h_i \in k^n$ are columns of scalar constants. The same long-division
process⁷ as for ordinary polynomials with constant coefficients allows us
to divide the polynomial (15.2) from the left side by the polynomial $tE - F_v$ with
remainder of degree zero in $t$, i.e.,

$$t^m h_m + \cdots + t h_1 + h_0 = (tE - F_v) \cdot \left( t^{m-1} g_{m-1} + \cdots + t g_1 + g_0 \right) + r \qquad (15.3)$$

for appropriate $g_i, r \in k^n$.


⁵ See Sect. 14.3 on p. 351.
⁶ See Definition 14.4 on p. 345.


Exercise 15.5 Convince yourself of this.


It follows from (15.3) that every vector $h \in k[t]^n$ can be written as $h = (tE - F_v)g + r$
for appropriate $g \in k[t]^n$ and $r \in k^n$. In other words, each column in $k[t]^n$ is
congruent modulo the columns of the matrix $tE - F_v$ to a column of constants, that
is, to some linear combination $\sum \lambda_i e_i$, where $\lambda_i \in k$. In particular, $\pi_v(h) = \sum \lambda_i v_i$.
If $h \in \ker(\pi_v)$, then the latter sum is zero. This forces all $\lambda_i$ to equal 0, because the
$v_i$ form a basis in $V$. Hence, all $h \in \ker \pi_v$ lie in the linear span of the columns of
the matrix $tE - F_v$. □


Exercise 15.6 Show that dual operators⁸ $F : V \to V$ and $F^* : V^* \to V^*$ are similar
over every field. (Equivalently, every square matrix $A$ is conjugate to the transposed
matrix $A^t$.)


15.1.5 Minimal Polynomial 


For a monic irreducible polynomial $p \in k[t]$, we write $m_p(F)$ for the maximal integer
$m$ such that $p^m \in \mathcal{E}\ell(F)$. Therefore, $m_p(F) = 0$ for all $p$ except those appearing
in (15.1). We call $m_p(F)$ the index of $p$ in $\mathcal{E}\ell(F)$. It follows from Theorem 15.1 that
the minimal polynomial⁹ of $F$ is equal to

$$\mu_F(t) = \prod_p p^{m_p(F)},$$

which is the least common multiple of all elementary divisors.


Corollary 15.3 A polynomial $f \in k[t]$ annihilates an operator $F : V \to V$ if and
only if $f$ is divisible by each elementary divisor of $F$. □


⁷ See Sect. 3.2 on p. 46.
⁸ See Sect. 7.3 on p. 164.
⁹ That is, the monic polynomial $\mu_F(t)$ of lowest positive degree such that $\mu_F(F) = 0$ (see
Sect. 8.1.3 on p. 175).

Let $F_v$ be the matrix of the operator $F : V \to V$ in some basis $v$ of $V$. The
characteristic polynomial of this matrix $\chi_{F_v}(t) = \det(tE - F_v)$ does not depend
on the choice of basis, because in another basis $w = v\,C$, we get¹⁰ $F_w = C^{-1}F_vC$
and

$$\begin{aligned}
\det(tE - F_w) &= \det\left(tE - C^{-1}F_vC\right) = \det\left(C^{-1}(tE - F_v)\,C\right) \\
&= \det C^{-1} \cdot \det(tE - F_v) \cdot \det C = \det(tE - F_v).
\end{aligned}$$

For this reason, $\chi_F(t) \stackrel{\text{def}}{=} \det(tE - F_v)$ is called the characteristic polynomial of
the operator $F$. The previous computation shows that similar operators have equal
characteristic polynomials.
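The invariance of the characteristic polynomial under conjugation can be observed on a small example. The sketch below computes the coefficients of $\det(tE - A)$ by the Faddeev–LeVerrier recursion, a technique not used in the text and chosen here only because it is short. It divides by $1, 2, \ldots, n$, so it assumes characteristic zero (exact rationals are used). The matrices `Fv`, `C`, `C_inv` and the function names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(tE - A) = sum c_k t^k (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]          # leading coefficient c_n = 1
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = matmul(A, M)                                # M_k = A M_{k-1} + c_{n-k+1} E
        for i in range(n):
            M[i][i] += coeffs[n - k + 1]
        AM = matmul(A, M)
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k   # c_{n-k} = -tr(A M_k) / k
    return coeffs

Fv = [[2, 1], [0, 3]]
C = [[1, 1], [0, 1]]
C_inv = [[1, -1], [0, 1]]
Fw = matmul(matmul(C_inv, Fv), C)                       # matrix of F in the basis w = v C

print(char_poly(Fv) == char_poly(Fw))
```

Both matrices give $t^2 - 5t + 6$, in agreement with the computation above.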


Exercise 15.7 Let an operator $F$ be decomposable as the direct sum of operators $G$,
$H$. Check that $\chi_F(t) = \chi_G(t) \cdot \chi_H(t)$.


Exercise 15.8 For a monic polynomial $f \in k[t]$, check that the characteristic
polynomial of the multiplication-by-$t$ operator in the residue class module $k[t]/(f)$
equals $f$.


These exercises together with Theorem 15.1 lead to the following result.


Corollary 15.4 The characteristic polynomial of an operator $F$ is the product of
all the elementary divisors of $F$. □
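As a small worked illustration (the operator here is hypothetical and not taken from the text), suppose an operator $F$ over $\mathbb{R}$ has the three elementary divisors listed below. The characteristic polynomial is then their product, while the minimal polynomial is their least common multiple:

```latex
\begin{align*}
\mathcal{E}\ell(F) &= \{\, t-1,\ (t-1)^2,\ t^2+1 \,\}, \\
\chi_F(t) &= (t-1)\cdot(t-1)^2\cdot(t^2+1) = (t-1)^3\,(t^2+1),
  \qquad \dim V = \deg \chi_F = 5, \\
\mu_F(t) &= \operatorname{LCM}\bigl(t-1,\ (t-1)^2,\ t^2+1\bigr) = (t-1)^2\,(t^2+1).
\end{align*}
```

In this example the two polynomials differ exactly by the factor $t - 1$ contributed by the repeated irreducible $p = t - 1$.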


Exercise 15.9 Use Corollary 15.4 to give a new proof of the Cayley–Hamilton
identity.¹¹


Proposition 15.1 Every operator $F$ on a nonzero space over the field $\mathbb{R}$ has a nonzero
invariant subspace of dimension $\le 2$.


Proof Let the characteristic polynomial of $F$ be factored as $\chi_F = q_1 \cdot q_2 \cdots q_m$,
where $q_i \in \mathbb{R}[t]$ are monic irreducible polynomials, not necessarily distinct.
Note that $\deg q_i \le 2$ for each $i$. If we apply the zero operator $0 = \chi_F(F) =
q_1(F) \circ q_2(F) \circ \cdots \circ q_m(F)$ to any nonzero vector $v \in V$, then for some $i$ with
$1 \le i \le m$, we obtain a nonzero vector $w = q_{i+1}(F) \circ \cdots \circ q_m(F)v \neq 0$ such that
$q_i(F)w = 0$. If $q_i(t) = t - \lambda$ is linear, then $F(w) = \lambda w$, and we get a 1-dimensional
invariant subspace $\mathbb{R} \cdot w$. If $q_i(t) = t^2 - \alpha t - \beta$ is quadratic, then $F(Fw) = \alpha F(w) + \beta w$
remains within the linear span of $w$, $Fw$, and we get an $F$-invariant subspace of
dimension at most 2 generated by $w$ and $Fw$. □


¹⁰ See formula (8.13) on p. 182.
¹¹ Which says that $\chi_F(F) = 0$.

Corollary 15.5 (From the Proof of Proposition 15.1) Every linear operator over
an arbitrary algebraically closed field has an invariant subspace of dimension one. □


15.2 Operators of Special Types 


In this section we discuss some particular types of operators: nilpotent, semisimple, 
cyclic, and diagonalizable. All these types play important special roles in many 
branches of mathematics as well as in applications. We characterize operators of 
each type in two ways: as k[¢]-modules and in intrinsic terms concerned with their 
action on V. 


15.2.1 Nilpotent Operators 


An operator $F : V \to V$ is called nilpotent if $F^m = 0$ for some $m \in \mathbb{N}$. Let $F$ be
nilpotent. Since $F$ is annihilated by a polynomial $t^m$, all elementary divisors of $F$
have the form $t^\nu$, i.e., $F$ is similar to multiplication by $t$ in the direct sum of residue
class modules

$$\frac{k[t]}{(t^{\nu_1})} \oplus \cdots \oplus \frac{k[t]}{(t^{\nu_k})}. \qquad (15.4)$$

Let us number the exponents in nonincreasing order $\nu_1 \ge \nu_2 \ge \cdots \ge \nu_k$.
By Theorem 15.1, this nonincreasing sequence of positive integers determines $F$
uniquely up to conjugation. In other words, the conjugation classes of nilpotent
operators over a field $k$ are in bijection with the Young diagrams. The Young
diagram $\nu(F)$ corresponding to a nilpotent operator $F$ is called the cyclic type of $F$.
Write $[f]$ for the residue class $f \pmod{t^m}$ and choose

$$e_1 = [t^{m-1}],\quad e_2 = [t^{m-2}],\quad \ldots,\quad e_m = [1]$$

as a basis in $k[t]/(t^m)$. Multiplication by $t$ acts as

$$0 \mapsfrom e_1 \mapsfrom e_2 \mapsfrom e_3 \mapsfrom \cdots \mapsfrom e_{m-1} \mapsfrom e_m,$$

that is, has matrix

$$J_m(0) \stackrel{\text{def}}{=} \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & 0 & \cdots & 0 & 1 \\
0 & 0 & \cdots & 0 & 0
\end{pmatrix}.$$

This matrix is called a nilpotent Jordan block of size $m$. Thus, a nilpotent operator $F$
of cyclic type $\nu$ admits a basis whose vectors can be written in the cells of a diagram
$\nu$ in such a way that $F$ annihilates the leftmost vector of each row and sends every
other vector to its left neighbor, as in the following example:

[Diagram (15.5): a Young diagram in which every row is a chain $0 \mapsfrom \bullet \mapsfrom \bullet \mapsfrom \cdots \mapsfrom \bullet$, one chain per row.]

A basis of this type is called a Jordan basis.¹² The elements of such a basis are
combined in sequences along the rows of the Young diagram. These sequences are
called Jordan chains.

The cyclic type of $F$ can be determined in terms of the action of $F$ on $V$ as
follows. The total number of cells in the leftmost $m$ columns of the diagram $\nu(F)$ is
equal to $\dim \ker F^m$. Thus, the $m$th column of the diagram $\nu(F)$ has length¹³

$$\nu'_m(F) = \dim \ker F^m - \dim \ker F^{m-1}. \qquad (15.6)$$
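Formula (15.6) gives a direct way to compute the cyclic type of a nilpotent matrix. The sketch below is an illustration under stated assumptions: the field is taken to be $\mathbb{Q}$, kernel dimensions are obtained as $n$ minus the rank (computed by Gaussian elimination with exact fractions), and the names `rank`, `matmul`, and `cyclic_type` are not from the text. The column lengths from (15.6) are transposed at the end to obtain the row lengths.

```python
from fractions import Fraction

def rank(A):
    """Rank of a matrix over Q by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in A]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def cyclic_type(F):
    """Row lengths of the Young diagram nu(F) of a nilpotent matrix F, via (15.6)."""
    n = len(F)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # F^0 = E
    previous_kernel, columns = 0, []
    while previous_kernel < n:
        power = matmul(power, F)                 # next power of F
        kernel = n - rank(power)                 # dim ker F^m
        if kernel == previous_kernel:
            raise ValueError("F is not nilpotent")
        columns.append(kernel - previous_kernel) # length of the m-th column
        previous_kernel = kernel
    if not columns:
        return []
    # Transpose the diagram: row j has as many cells as there are columns longer than j.
    return [sum(1 for c in columns if c > j) for j in range(columns[0])]

F = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]        # Jordan blocks J_3(0) and J_1(0)
print(cyclic_type(F))
```

For this block-diagonal matrix the kernel dimensions of $F$, $F^2$, $F^3$ are 2, 3, 4, so the columns have lengths 2, 1, 1 and the rows have lengths 3 and 1.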


15.2.2. Semisimple Operators 


An operator F : V > V is called semisimple’ if V is a direct sum of simple (or 
irreducible) spaces with operators. Semisimplicity is somehow “complementary” to 
nilpotency. It has several equivalent characterizations. 


Proposition 15.2 The following properties of an operator $F : V \to V$ are
equivalent:


(1) $V$ is a direct sum of irreducible $F$-invariant subspaces.

(2) $V$ is linearly generated by irreducible $F$-invariant subspaces.

(3) For every proper $F$-invariant subspace $U \subset V$, there exists an $F$-invariant
subspace $W \subset V$ such that $V = U \oplus W$.

(4) $F$ is similar to multiplication by $t$ in the direct sum of residue modules

$$k[t]/(p_1) \oplus k[t]/(p_2) \oplus \dots \oplus k[t]/(p_r),$$

where $p_i \in k[t]$ are monic irreducible, and repeated summands are allowed (in
other words, all exponents $m_i = 1$ in formula (15.1) on p. 363).


¹² Or cyclic basis.

¹³ Recall that we write $\nu_i$ for the length of the $i$th row in the Young diagram $\nu$ and write $\nu'$ for the
transposed Young diagram. Thus, $\nu'_m$ means the length of the $m$th column of $\nu$.

¹⁴ Or completely reducible.


Proof The implication (1) $\Rightarrow$ (2) is obvious. Let us prove the implication (2) $\Rightarrow$
(3). For every irreducible $F$-invariant subspace $L \subset V$, the intersection $L \cap U$ is
either zero or $L$, because $L \cap U \subset L$ is $F$-invariant. If $U$ contains all $F$-invariant
irreducible subspaces of $V$, then $U = V$ by (2). Since $U$ is proper, there exists a
nonzero irreducible $F$-invariant subspace $L \subset V$ such that $L \cap U = 0$. If $U \oplus L = V$,
we are done. If not, we replace $U$ by $U' = U \oplus L$ and repeat the argument. Since
$\dim U' > \dim U$, after a finite number of steps we will have constructed a sequence
of nonzero irreducible $F$-invariant subspaces $L_1, L_2, \dots, L_k$ such that $L_1 = L$,
$L_i \cap (U \oplus L_1 \oplus L_2 \oplus \dots \oplus L_{i-1}) = 0$ for $1 \leq i \leq k$, and $U \oplus L_1 \oplus L_2 \oplus \dots \oplus L_k = V$.
Then $W = L_1 \oplus L_2 \oplus \dots \oplus L_k$ is what we need.

To verify the implication (3) $\Rightarrow$ (4), let us show first that if (3) holds in $V$, then (3)
holds in every $F$-invariant subspace $H \subset V$. Consider a proper $F$-invariant subspace
$U \subset H$. There exist $F$-invariant subspaces $Q, R \subset V$ such that $V = H \oplus Q =
U \oplus Q \oplus R$. Write $\pi : V \to H$ for the projection along $Q$ and put $W = \pi(R)$.


Exercise 15.10 Check that $H = U \oplus W$.


Thus, if (3) is true for a direct sum (15.1), then (3) holds within each direct summand.
By Corollary 15.1, all summands $k[t]/(p^m)$ are indecomposable. However,
if $m > 1$, then $k[t]/(p^m)$ contains a nonzero proper invariant subspace, which by
indecomposability has no invariant complement. Hence, if (3)
holds in $V$, then all exponents $m_i$ are equal to 1 in the decomposition (15.1). The
implication (4) $\Rightarrow$ (1) follows immediately from Corollary 15.1. □


Corollary 15.6 (From the Proof of Proposition 15.2) The restriction of a
semisimple operator $F$ to any $F$-invariant subspace is semisimple.


Example 15.1 (Euclidean Isometries) Let us show that every orthogonal operator¹⁵
$F : V \to V$ on a Euclidean vector space $V$ is semisimple. For every proper $F$-invariant
subspace $U \subset V$, there is an orthogonal decomposition¹⁶ $V = U \oplus U^\perp$.
Let us verify that the orthogonal complement $U^\perp$ is $F$-invariant. Since the restriction
$F|_U$ is a Euclidean isometry of $U$, it is bijective, and therefore for each $u \in U$,
there is $u' \in U$ such that $u = Fu'$. Hence for each $w \in U^\perp$ and $u \in U$, we have
$(Fw, u) = (Fw, Fu') = (w, u') = 0$, that is, $Fw \in U^\perp$, as required. We know
from Proposition 15.1 on p. 366 that every operator $F$ on a real vector space has a
1- or 2-dimensional invariant subspace. For every semisimple operator $F$, such an
invariant space splits off as a direct summand. The previous argument shows that
for a Euclidean isometry, it is an orthogonal direct summand. We conclude that a
Euclidean vector space with isometry $F$ splits into an orthogonal direct sum of
1-dimensional subspaces, where $F$ acts as $\pm\mathrm{Id}$, and 2-dimensional subspaces, where
$F$ acts as rotation.


Exercise 15.11 Deduce the last statement from Example 10.10 and Example 10.11 
on p. 245. 


¹⁵ See Sect. 10.5 on p. 244.
¹⁶ See Sect. 10.1 on p. 237.


Theorem 15.2 (Normal Form of a Euclidean Isometry) For every Euclidean
isometry $F : V \to V$, there exists an orthonormal basis of $V$ in which $F$ has a
block-diagonal matrix formed by one-cell blocks $\pm 1$ and $2 \times 2$ blocks

$$\begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}, \quad\text{where } 0 < \varphi < \pi,$$

representing counterclockwise rotation about the origin by the angle $\varphi \in (0, \pi)$.
Up to a permutation of blocks, this block-diagonal matrix does not depend on the
choice of the orthonormal basis in question.


Proof The existence was explained in Example 15.1. To see uniqueness, note that
each 1-dimensional isometry $\pm\mathrm{Id}$ contributes an elementary divisor $t \mp 1$ to $\mathcal{E}\ell(F)$,
and each rotation of the plane by the angle $\varphi$ contributes an elementary divisor
$t^2 - 2\cos\varphi \cdot t + 1$ to $\mathcal{E}\ell(F)$. Thus, the blocks are in bijection with the elementary
divisors of $F$. □


15.2.3 Cyclic Vectors and Cyclic Operators 


A vector $v \in V$ is called a cyclic vector of the linear operator $F : V \to V$ if its
$F$-orbit $v, Fv, F^2v, F^3v, \dots$ spans $V$ over $k$, or equivalently, if $v$ generates $V_F$ over
$k[t]$. Operators possessing cyclic vectors are called cyclic.


Proposition 15.3 The following properties of an operator $F : V \to V$ are
equivalent:

(1) $F$ is cyclic.

(2) $F$ is similar to multiplication by $t$ in the residue class module $k[t]/(f)$, where
$f \in k[t]$ is some monic polynomial.

(3) Each monic irreducible $p \in k[t]$ appears in $\mathcal{E}\ell(F)$ at most once.

(4) The minimal and characteristic polynomials of $F$ coincide.


Proof The equivalence (3) $\Leftrightarrow$ (4) follows at once from Corollary 15.4. Both
conditions mean that $F$ is similar to multiplication by $t$ in the direct sum of residue
class modules

$$k[t]/(p_1^{m_1}) \oplus k[t]/(p_2^{m_2}) \oplus \dots \oplus k[t]/(p_r^{m_r}),$$

where $p_1, p_2, \dots, p_r$ are monic, irreducible, and distinct. By the Chinese remainder
theorem, this sum is isomorphic to the residue class module $k[t]/(f)$, where

$$f = \chi_F = \mu_F = \prod_{i=1}^{r} p_i^{m_i}.$$

Thus, (2) is equivalent to (3), (4). The implication (2) $\Rightarrow$ (1) is obvious: the class
of the unit $[1]$ is a cyclic vector for multiplication by $t$ in $k[t]/(f)$. Conversely, if $V_F$
is generated by $v$ over $k[t]$, then a $k[t]$-linear surjection $\pi : k[t] \to V_F$ is given by
$1 \mapsto v$. Hence, $V_F \simeq k[t]/\ker\pi$. Since $k[t]$ is a principal ideal domain, the ideal
$\ker\pi \subset k[t]$ has the form $(f)$ for some monic $f \in k[t]$. Thus, (1) $\Rightarrow$ (2). □


Exercise 15.12 Show that multiplication-by-$t$ operators in $k[t]/(f)$ and $k[t]/(g)$
(where $f$, $g$ both are monic) are similar if and only if $f = g$.
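Whether a given vector is cyclic can be tested mechanically: the orbit $v, Fv, F^2v, \dots$ spans $V$ exactly when its first $\dim V$ members do, since every higher power of $F$ is a linear combination of lower ones by the Cayley-Hamilton identity. The Python sketch below checks this rank condition over the rationals. The function names are hypothetical, and the matrix is assumed to act on column vectors.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def apply(f, v):
    """Image of the column vector v under the matrix f."""
    return [sum(a * x for a, x in zip(row, v)) for row in f]


def is_cyclic_vector(f, v):
    """True if v, Fv, ..., F^(n-1) v span the whole n-dimensional space."""
    n = len(f)
    orbit, w = [], list(v)
    for _ in range(n):
        orbit.append(w)
        w = apply(f, w)
    return rank(orbit) == n


J = [[0, 1], [0, 0]]   # nilpotent Jordan block of size 2
E = [[1, 0], [0, 1]]   # identity operator
```

For the Jordan block `J`, the vector `[0, 1]` is cyclic while `[1, 0]` is not; the identity operator on a 2-dimensional space has no cyclic vector at all, so any sample vector fails the test.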


15.2.4 Eigenvectors and Eigenvalues 


A nonzero vector $v \in V$ is called an eigenvector of an operator $F : V \to V$ if
$F(v) = \lambda v$ for some $\lambda \in k$. In this case, $\lambda$ is called an eigenvalue of $F$. The set of
all eigenvalues is called the spectrum of $F$ in $k$ and is denoted by

$$\operatorname{Spec}_k(F) = \{\lambda \in k \mid \exists\, v \in V \smallsetminus 0 : Fv = \lambda v\}$$

(we will omit the index $k$ when it is not important). Note that $\operatorname{Spec}_k(F)$ is a subset
of $k$, and each $\lambda \in \operatorname{Spec}(F)$ is counted there just once (in contrast with $\mathcal{E}\ell(F)$, where
we allow multiple entries).


Proposition 15.4 $\operatorname{Spec}_k(F)$ coincides with the set of roots¹⁷ of the characteristic
polynomial $\chi_F(t) = \det(t\,\mathrm{Id} - F)$ in $k$. In particular, $|\operatorname{Spec}(F)| \leq \dim V$.


Proof An eigenvector $v$ with eigenvalue $\lambda$ lies in the kernel of the linear operator
$\lambda\,\mathrm{Id} - F$. By Exercise 9.12, $\ker(\lambda\,\mathrm{Id} - F) \neq 0$ if and only if
$\det(\lambda\,\mathrm{Id} - F) = 0$. □


Corollary 15.7 If $k$ is algebraically closed, then $\operatorname{Spec}_k(F) \neq \varnothing$ for every operator
$F$. In particular, every operator has an eigenvector.¹⁸ □


Exercise 15.13 Prove that over every field $k$ the spectrum $\operatorname{Spec}_k F$ is contained in
the set of roots of every polynomial $f \in k[t]$ such that $f(F) = 0$.


Exercise 15.14 Prove that over an algebraically closed field $k$, an operator $F$ is
nilpotent if and only if $\operatorname{Spec}_k F = \{0\}$.


Proposition 15.5 Every set of eigenvectors with distinct eigenvalues is linearly 
independent. 


¹⁷ Counted without multiplicities.
¹⁸ Note that this agrees with Corollary 15.5 on p. 367.


Proof Assume the contrary and let $x_1v_1 + x_2v_2 + \dots + x_kv_k = 0$ be a linear relation
of minimal length $k$, so that all the coefficients $x_i$ are nonzero. Write $\lambda_1, \lambda_2, \dots, \lambda_k$
for the eigenvalues of $v_1, v_2, \dots, v_k$ respectively. If we apply the operator to the previous
relation, then we get another relation, $x_1\lambda_1v_1 + x_2\lambda_2v_2 + \dots + x_k\lambda_kv_k = 0$.
Multiply the first relation by $\lambda_k$ and subtract from the second. We get a strictly shorter relation

$$0 = x_1(\lambda_1 - \lambda_k)v_1 + x_2(\lambda_2 - \lambda_k)v_2 + \dots + x_{k-1}(\lambda_{k-1} - \lambda_k)v_{k-1}$$

with nonzero coefficients. Contradiction. □


15.2.5 Eigenspaces 


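For each $\lambda \in k$, a subspace $V_\lambda \stackrel{\text{def}}{=} \{v \in V \mid F(v) = \lambda v\} = \ker(\lambda\,\mathrm{Id}_V - F)$ is called
a $\lambda$-eigenspace of $F$. It is nonzero if and only if $\lambda \in \operatorname{Spec}(F)$, or equivalently,
$\chi_F(\lambda) = 0$. Each eigenspace $V_\lambda$ is $F$-invariant, and $F|_{V_\lambda} = \lambda \cdot \mathrm{Id}_{V_\lambda}$. From
Proposition 15.5, we get the following corollary.


Corollary 15.8 A sum of nonzero eigenspaces is a direct sum, and
$\sum_{\lambda \in \operatorname{Spec} F} \dim V_\lambda \leq \dim V$. □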


Exercise 15.15 Give an example of a linear operator for which the inequality is 
strict. 


15.2.6 Diagonalizable Operators 


A linear operator $F : V \to V$ is called diagonalizable if it has a diagonal matrix
in some basis of $V$. Such a basis consists of eigenvectors of $F$, and the elements
of the diagonal matrix are the eigenvalues of $F$. The characteristic polynomial
$\chi_F(t) = \det(tE - F)$ computed by means of such a diagonal matrix is the product
$\prod_\lambda (t - \lambda)$, where $\lambda$ runs through the diagonal elements. We conclude that each
$\lambda \in \operatorname{Spec}(F)$ appears on the diagonal, and the number of its appearances is equal to
both $\dim V_\lambda$ and the multiplicity of the root $\lambda$ in $\chi_F(t)$. Thus, up to a permutation
of the diagonal elements, a diagonal matrix of $F$ does not depend on the choice of
basis in which $F$ has a diagonal matrix. From the $k[t]$-module viewpoint, we can
say that a diagonalizable operator $F$ is similar to multiplication by $t$ in a direct sum
of residue modules $k[t]/(t - \lambda) \simeq k$, where $\lambda$ runs through $\operatorname{Spec} F$ and every such
summand appears in the direct sum exactly $\dim V_\lambda$ times.


Proposition 15.6 The following properties of a linear operator $F : V \to V$ are
equivalent:


(1) $F$ is diagonalizable.

(2) $V$ is spanned by the eigenvectors of $F$.

(3) The characteristic polynomial $\chi_F(t) = \det(tE - F)$ is completely factorizable
in $k[t]$ into a product of linear factors, and the multiplicity of each factor $(t - \lambda)$
is equal to $\dim V_\lambda$.

(4) All elementary divisors of $F$ have the form $(t - \lambda)$, where $\lambda \in k$.

(5) The operator $F$ is annihilated by a separable¹⁹ polynomial $f \in k[t]$ that is
completely factorizable over $k$ as a product of linear factors.


Proof The equivalences (2) $\Leftrightarrow$ (1) $\Leftrightarrow$ (4) and the implication (1) $\Rightarrow$ (3) are
evident from the discussion preceding the proposition. The equivalence (4) $\Leftrightarrow$
(5) follows at once from Corollary 15.3. If (3) holds, then $\sum_{\lambda \in \operatorname{Spec} F} \dim V_\lambda =
\deg \chi_F = \dim V$. Hence $\bigoplus_{\lambda \in \operatorname{Spec}(F)} V_\lambda = V$. Thus, (3) $\Rightarrow$ (1). □
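Condition (5) gives a test that needs no eigenvectors at all: if the product of the operators $F - \lambda E$ over a finite set of distinct scalars $\lambda$ vanishes, then $F$ is annihilated by a separable, completely factorizable polynomial and is therefore diagonalizable. The Python sketch below performs this check for integer matrices. The function name is hypothetical, and the list of candidate values is assumed to be supplied by the caller, for instance as the known roots of the characteristic polynomial.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def annihilated_by_linear_factors(f, values):
    """True if the product of (F - lam*E) over the distinct lam in values is zero."""
    n = len(f)
    product = [[int(i == j) for j in range(n)] for i in range(n)]
    for lam in set(values):  # repeated values are used once: the polynomial is separable
        shifted = [[f[i][j] - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        product = matmul(product, shifted)
    return all(x == 0 for row in product for x in row)


A = [[1, 1], [0, 2]]   # distinct eigenvalues 1 and 2
B = [[2, 1], [0, 2]]   # Jordan block with eigenvalue 2
```

Here `A` passes the test with the values 1 and 2, whereas `B` fails it with the single value 2, in agreement with the fact that a Jordan block of size 2 is not diagonalizable.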


Corollary 15.9 The restriction of a diagonalizable operator F to an F-invariant 
subspace is diagonalizable as well. 


Proof This is provided by (5). oO 


Corollary 15.10 Over an algebraically closed field, diagonalizability is equivalent 
to semisimplicity. 


Proof The monic irreducible polynomials over an algebraically closed field are
exhausted by the linear binomials $t - \lambda$. □


Exercise 15.16 Give an example of a nondiagonalizable semisimple linear operator 
over R. 


15.2.7 Annihilating Polynomials


If we know a polynomial $f \in k[t]$ that annihilates an operator $F : V \to V$ and
know the irreducible factorization of $f$ in $k[t]$ as well, then by Corollary 15.3, the
collection of elementary divisors $\mathcal{E}\ell(F)$ becomes restricted to a finite number of
effectively enumerable possibilities. Often, this allows us to decompose $V$ into a
direct sum of $F$-invariant subspaces defined in intrinsic terms of the $F$-action on $V$.


¹⁹ That is, without multiple roots.


Example 15.2 (Involutions) Assume that $\operatorname{char} k \neq 2$. A linear operator $\sigma : V \to V$
is called an involution if $\sigma^2 = \mathrm{Id}_V$. Thus, $\sigma$ is annihilated by the polynomial $t^2 - 1 =
(t + 1)(t - 1)$, which satisfies condition (5) of Proposition 15.6. We conclude that
either $\sigma = \pm\mathrm{Id}_V$ or $V = V_+ \oplus V_-$, where

$$V_+ = \ker(\sigma - \mathrm{Id}_V) = \operatorname{im}(\sigma + \mathrm{Id}_V) \quad\text{and}\quad V_- = \ker(\sigma + \mathrm{Id}_V) = \operatorname{im}(\sigma - \mathrm{Id}_V)$$

are the $\pm 1$-eigenspaces of $\sigma$. An arbitrary vector $v \in V$ is decomposed in terms of
them as

$$v = v_+ + v_- = \frac{v + \sigma v}{2} + \frac{v - \sigma v}{2}$$

with $v_\pm = (v \pm \sigma v)/2 \in V_\pm$.
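The decomposition of a vector into its components in the two eigenspaces of an involution is easy to carry out explicitly. The following Python sketch does so over the rationals; the sample involution (the coordinate swap on a 2-dimensional space) and the function names are assumptions made for illustration only.

```python
from fractions import Fraction


def apply(m, v):
    """Image of the column vector v under the matrix m."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def split_by_involution(sigma, v):
    """Return (v_plus, v_minus) with v = v_plus + v_minus and sigma(v_pm) = pm v_pm."""
    sv = apply(sigma, v)
    v_plus = [Fraction(a + b) / 2 for a, b in zip(v, sv)]
    v_minus = [Fraction(a - b) / 2 for a, b in zip(v, sv)]
    return v_plus, v_minus


swap = [[0, 1], [1, 0]]          # assumed example: swap squared is the identity
v_plus, v_minus = split_by_involution(swap, [3, 1])
```

For the vector $(3, 1)$ the components are $(2, 2)$, which the swap fixes, and $(1, -1)$, which the swap negates.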


Example 15.3 (Projectors) A linear operator $\pi : V \to V$ is called a projector²⁰
if $\pi^2 = \pi$. Thus, $\pi$ is annihilated by the polynomial $t^2 - t = t(t - 1)$, which
also satisfies condition (5) of Proposition 15.6. We conclude that either $\pi = 0$, or
$\pi = \mathrm{Id}$, or $V = V_0 \oplus V_1$, where $V_0 = \ker\pi$ and $V_1 = \{v \in V \mid \pi v = v\}$
consists of the fixed vectors of $\pi$. The first two projectors are called trivial. The
third projects $V$ onto $\operatorname{im}\pi$ along $\ker\pi$ (which explains the terminology). Note that
for every idempotent $\pi$, the operator $\mathrm{Id}_V - \pi$ is idempotent, too, and projects $V$ onto
$\ker\pi$ along $\operatorname{im}\pi$. Thus, to produce a direct sum decomposition $V = U \oplus W$, it is
necessary and sufficient to choose two idempotent operators $\pi_1 = \pi_1^2$, $\pi_2 = \pi_2^2$ on
$V$ such that $\pi_1 + \pi_2 = \mathrm{Id}_V$ and $\pi_1\pi_2 = \pi_2\pi_1 = 0$.


Exercise 15.17 Deduce from these relations that $\ker\pi_1 = \operatorname{im}\pi_2$ and $\operatorname{im}\pi_1 =
\ker\pi_2$.
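The relations between a projector and its complement can be verified on a small example. In the Python sketch below, the matrix `P1` is an assumed sample projector of the plane onto the line spanned by $(1, 0)$ along the line spanned by $(1, -1)$; the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def complement(p):
    """The complementary operator Id - p."""
    n = len(p)
    return [[int(i == j) - p[i][j] for j in range(n)] for i in range(n)]


def is_idempotent(p):
    """True if p squared equals p."""
    return matmul(p, p) == p


P1 = [[1, 1], [0, 0]]    # assumed sample projector
P2 = complement(P1)      # projects onto ker P1 along im P1
ZERO = [[0, 0], [0, 0]]
```

Both `P1` and `P2` are idempotent, their sum is the identity by construction, and their products in either order vanish, which is exactly the set of relations stated at the end of Example 15.3.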


²⁰ Or an idempotent operator.


Proposition 15.7 Assume that the operator $F : V \to V$ is annihilated by the
polynomial $q \in k[t]$ factored in $k[t]$ as

$$q = q_1 \cdot q_2 \cdots q_r, \quad\text{where } \operatorname{GCD}(q_i, q_j) = 1 \text{ for all } i \neq j.$$

Let $Q_i = \prod_{\nu \neq i} q_\nu = q/q_i$. Then $\ker q_i(F) = \operatorname{im} Q_i(F)$ for each $i$, all these
subspaces are $F$-invariant, and $V$ is a direct sum of those of them that are nonzero.


Proof The invariance of $\ker q_i(F)$ is obvious: $q_i(F)v = 0 \Rightarrow q_i(F)Fv =
Fq_i(F)v = 0$. The inclusion $\operatorname{im} Q_i(F) \subset \ker q_i(F)$ follows from the vanishing of
$q_i(F) \circ Q_i(F) = q(F) = 0$. Let us show that $\ker q_i(F) \cap \sum_{\nu \neq i} \ker q_\nu(F) = 0$,
i.e., that the sum of the nonzero subspaces $\ker q_i(F)$ is a direct sum. Since all $q_i$
are mutually coprime, $\operatorname{GCD}(q_i, Q_i) = 1$ as well. Hence there exist $g, h \in k[t]$ such
that $1 = g(t)q_i(t) + h(t)Q_i(t)$. Substituting $t = F$, we get $\mathrm{Id} = g(F) \circ q_i(F) +
h(F) \circ Q_i(F)$. Applying both sides to any vector $v \in \ker q_i(F) \cap \sum_{\nu \neq i} \ker q_\nu(F)$,
we get $v = g(F) \circ q_i(F)v + h(F) \circ Q_i(F)v = 0$, because $\ker Q_i(F)$ contains
all $\ker q_\nu(F)$ with $\nu \neq i$. Finally, let us show that the subspaces $\operatorname{im} Q_i(F)$ span
$V$. Since $\operatorname{GCD}(Q_1, Q_2, \dots, Q_r) = 1$, there exist $h_1, h_2, \dots, h_r \in k[t]$ such that
$1 = \sum_i Q_i(t) \cdot h_i(t)$. Substitute $t = F$ and apply both sides to any $v \in V$. We
see that $v = \sum_i Q_i(F)h_i(F)v \in \sum_i \operatorname{im} Q_i(F)$. This forces $\operatorname{im} Q_i(F) = \ker q_i(F)$ and
$\bigoplus_i \ker q_i(F) = V$. □


15.3 Jordan Decomposition


15.3.1 Jordan Normal Form


In this section, we assume by default that the ground field $k$ is algebraically closed.
In this case, the monic irreducible elements of $k[t]$ are exhausted by the linear
binomials $(t - \lambda)$. Thus, by Theorem 15.1, every linear operator $F : V \to V$ is
similar to multiplication by $t$ in the direct sum of residue modules

$$\frac{k[t]}{((t - \lambda_1)^{m_1})} \oplus \dots \oplus \frac{k[t]}{((t - \lambda_s)^{m_s})}, \tag{15.7}$$

and two such multiplication operators in the sums

$$\frac{k[t]}{((t - \lambda_1)^{m_1})} \oplus \dots \oplus \frac{k[t]}{((t - \lambda_r)^{m_r})} \quad\text{and}\quad \frac{k[t]}{((t - \mu_1)^{n_1})} \oplus \dots \oplus \frac{k[t]}{((t - \mu_s)^{n_s})}$$

are similar if and only if $r = s$, and after appropriate renumbering we get $\lambda_i = \mu_i$,
$m_i = n_i$ for all $i$. Multiplication by $t = \lambda + (t - \lambda)$ in $k[t]/((t - \lambda)^m)$ is the sum of
the homothety $\lambda E : f \mapsto \lambda f$ and nilpotent operator $\eta : f \mapsto (t - \lambda) \cdot f$, for which the
classes of powers

$$[(t - \lambda)^{m-1}],\ [(t - \lambda)^{m-2}],\ \dots,\ [(t - \lambda)],\ [1] \tag{15.8}$$

form a Jordan chain²¹ of length $m$. In the basis (15.8), multiplication by $t$ has the
matrix

$$J_m(\lambda) \stackrel{\text{def}}{=} \lambda E + J_m(0) = \begin{pmatrix} \lambda & 1 & & & \\ & \lambda & 1 & & \\ & & \ddots & \ddots & \\ & & & \lambda & 1 \\ & & & & \lambda \end{pmatrix} \tag{15.9}$$

of size $m \times m$ with zeros in all other cells. It is called a Jordan block of size $m$ with
eigenvalue $\lambda$.


²¹ See Sect. 15.2.1 on p. 367.


Corollary 15.11 (Jordan Normal Form) Every linear operator $F : V \to V$ over
an algebraically closed field $k$ has a basis in which $F$ has a block-diagonal matrix

$$\begin{pmatrix} J_{m_1}(\lambda_1) & & & \\ & J_{m_2}(\lambda_2) & & \\ & & \ddots & \\ & & & J_{m_k}(\lambda_k) \end{pmatrix} \tag{15.10}$$

formed by Jordan blocks $J_{m_1}(\lambda_1), J_{m_2}(\lambda_2), \dots, J_{m_k}(\lambda_k)$, where repeated blocks are
allowed as well. Up to permutations of blocks, the matrix (15.10) does not depend
on the choice of basis in question. Two operators are similar if and only if their
matrices (15.10) coincide up to a permutation of blocks. □


Definition 15.1 The matrix (15.10) is called the Jordan normal form of the operator
$F$. Every basis of $V$ in which the matrix of $F$ takes its Jordan normal form is called
a Jordan basis for $F$.


15.3.2 Root Decomposition 


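Let $F : V \to V$ be a linear operator over an algebraically closed field and let
$\lambda \in \operatorname{Spec}(F)$ be an eigenvalue of $F$. The submodule of $(t - \lambda)$-torsion in $V_F$ is called
the root subspace of $F$ associated with $\lambda \in \operatorname{Spec} F$, or just $\lambda$-root subspace for short.
We denote it by

$$K_\lambda = \{v \in V \mid \exists\, m \in \mathbb{N} : (\lambda\,\mathrm{Id} - F)^m v = 0\} = \bigcup_{m \geq 1} \ker(\lambda\,\mathrm{Id} - F)^m = \ker(\lambda\,\mathrm{Id} - F)^{m_\lambda}, \tag{15.11}$$

where $m_\lambda = m_{(t - \lambda)}(F)$ is the index²² of $t - \lambda$ in $\mathcal{E}\ell(F)$. Since $0 \neq V_\lambda \subset K_\lambda$ for all
$\lambda \in \operatorname{Spec}(F)$, all the root subspaces are nonzero. The canonical decomposition of
$V_F$ into a direct sum of $(t - \lambda)$-torsion submodules looks like $V = \bigoplus_{\lambda \in \operatorname{Spec} F} K_\lambda$ and
is called the root decomposition of $F$.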


Exercise 15.18 Prove the existence of the root decomposition by means of 
Proposition 15.7 and the Cayley—Hamilton identity without any reference to the 
general theory from Chap. 14. 


²² That is, the maximal integer $m$ such that $(t - \lambda)^m \in \mathcal{E}\ell(F)$ (see Sect. 15.1.5 on p. 365).


The total number of Jordan blocks of size $m$ with eigenvalue $\lambda$ in the Jordan normal
form of an operator $F$ is equal to the number of length-$m$ rows in the Young diagram
$\nu$ of the nilpotent operator

$$(\lambda\,\mathrm{Id} - F)|_{K_\lambda} : K_\lambda \to K_\lambda.$$

By formula (15.6) on p. 368, the $k$th column of this diagram has length

$$\nu'_k = \dim\ker(\lambda\,\mathrm{Id} - F)^k - \dim\ker(\lambda\,\mathrm{Id} - F)^{k-1}. \tag{15.12}$$


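Therefore, the Jordan normal form of $F$ can be calculated as follows. Perform the
irreducible factorization of the characteristic polynomial:

$$\chi_F(t) = \prod_{\lambda \in \operatorname{Spec}(F)} (t - \lambda)^{d_\lambda}.$$

Then for each $\lambda \in \operatorname{Spec}(F)$ and each integer $k$ in the range $1 \leq k \leq d_\lambda$, calculate

$$\dim\ker(\lambda\,\mathrm{Id} - F)^k = \dim V - \operatorname{rk}(\lambda\,\mathrm{Id} - F)^k$$

while these numbers are strictly increasing (equivalently, while the ranks are strictly
decreasing). Then draw the Young diagram $\nu$ whose
columns have lengths (15.12). The number of blocks $J_\ell(\lambda)$ equals the number of
length-$\ell$ rows in $\nu$.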
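The procedure just described can be sketched in Python for a single eigenvalue. The code below is an illustration under stated assumptions rather than a general-purpose routine: the matrix has rational entries, the eigenvalue `lam` is supplied by the caller (finding the roots of the characteristic polynomial is not attempted), and all function names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def jordan_block_sizes(f, lam):
    """Sizes of the Jordan blocks of f with eigenvalue lam, largest first."""
    n = len(f)
    shifted = [[Fraction(lam) * (i == j) - f[i][j] for j in range(n)]
               for i in range(n)]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    kernel_dims = [0]
    while True:
        power = matmul(power, shifted)
        d = n - rank(power)          # dim ker (lam*Id - F)^k
        if d == kernel_dims[-1]:     # the kernels have stabilized
            break
        kernel_dims.append(d)
    cols = [b - a for a, b in zip(kernel_dims, kernel_dims[1:])]  # formula (15.12)
    if not cols:
        return []                    # lam is not an eigenvalue
    return [sum(1 for c in cols if c >= i) for i in range(1, cols[0] + 1)]


# Assumed example: blocks J_2(2), J_1(2), J_1(3).
F = [[2, 1, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3]]
```

For this sample matrix the eigenvalue 2 yields kernel dimensions 2 and 3, hence columns of lengths 2, 1 and rows of lengths 2, 1, that is, one block of size 2 and one of size 1; the eigenvalue 3 yields a single block of size 1.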


15.3.3 Commuting Operators 


If linear operators $F, G : V \to V$ over an arbitrary field $k$ commute, then for every
$f \in k[t]$, the subspaces $\ker f(F)$ and $\operatorname{im} f(F)$ are $G$-invariant, because

$$f(F)v = 0 \ \Rightarrow\ f(F)Gv = Gf(F)v = 0,$$

$$v = f(F)w \ \Rightarrow\ Gv = Gf(F)w = f(F)Gw.$$

In particular, the eigenspaces $V_\lambda = \ker(F - \lambda E)$ and root spaces
$K_\lambda = \ker(\lambda\,\mathrm{Id} - F)^{m_\lambda}$
of the operator $F$ are sent to themselves by every linear operator commuting with $F$.


Proposition 15.8 Every set of commuting linear operators over an algebraically 
closed field has a common eigenvector. Over an arbitrary field, every set of 
diagonalizable commuting operators can be simultaneously diagonalized in a 
common basis. 


Proof By induction on $\dim V$, where $V$ is the space on which the operators act.
Both statements are obvious if all the operators are scalar dilations, which holds, in
particular, for $\dim V = 1$. Assume that one of the operators, call it $F$, is nonscalar.
To prove the first statement, note that $F$ has a proper eigenspace $0 \neq V_\lambda \subsetneq V$ sent
to itself by all operators. By induction, all operators have a common eigenvector
within $V_\lambda$. To prove the second statement, consider the decomposition of $V$ in a
direct sum of eigenspaces of $F$: $V = \bigoplus_{\lambda \in \operatorname{Spec}(F)} V_\lambda$. All operators send each $V_\lambda$ to
itself and are diagonalizable on $V_\lambda$ by Corollary 15.9 on p. 373. Thus, by the induction
hypothesis, they can be simultaneously diagonalized on each $V_\lambda$. □


15.3.4 Nilpotent and Diagonalizable Components 


Theorem 15.3 (Jordan Decomposition) For every linear operator $F$ over an
algebraically closed field $k$, there exists a unique pair of operators $F_s$, $F_n$ such that
$F_n$ is nilpotent, $F_s$ is diagonalizable, $F = F_s + F_n$, and $F_sF_n = F_nF_s$. Moreover,
$F_s = f_s(F)$ and $F_n = f_n(F)$ for appropriate polynomials $f_s, f_n \in k[t]$ without
constant terms. In particular, $F_s$ and $F_n$ commute with every operator commuting
with $F$, and all $F$-invariant subspaces are invariant for both operators $F_s$, $F_n$ as
well.


Proof Realize $F$ as multiplication by $t$ in a direct sum of residue modules

$$\frac{k[t]}{((t - \lambda_1)^{m_1})} \oplus \dots \oplus \frac{k[t]}{((t - \lambda_s)^{m_s})}, \tag{15.13}$$

and let $\operatorname{Spec} F = \{\lambda_1, \lambda_2, \dots, \lambda_r\}$. For each $i = 1, 2, \dots, r$, let us fix some integer
$a_i \geq m_{\lambda_i}(F)$, where $m_{\lambda_i}(F)$ is the maximal integer $m$ such that $(t - \lambda_i)^m \in \mathcal{E}\ell(F)$.
By the Chinese remainder theorem, there exist $f_1, f_2, \dots, f_r \in k[t]$ such that

$$f_\nu(t) \equiv \begin{cases} 1 \mod (t - \lambda_\nu)^{a_\nu}, \\ 0 \mod (t - \lambda_\mu)^{a_\mu} & \text{for } \mu \neq \nu. \end{cases}$$

If $\lambda_\nu \neq 0$, then the polynomial $t$ is invertible modulo $(t - \lambda_\nu)^{a_\nu}$, i.e., there exists
a polynomial $g_\nu(t)$ such that $t \cdot g_\nu(t) \equiv \lambda_\nu \mod (t - \lambda_\nu)^{a_\nu}$. For $\lambda_\nu = 0$, we put
$g_\nu = 0$. Therefore,

$$f_s(t) \stackrel{\text{def}}{=} t \cdot \sum_{\nu=1}^{r} g_\nu(t) f_\nu(t) \equiv \lambda_\nu \mod (t - \lambda_\nu)^{a_\nu}$$

for each $\nu$. The polynomial $f_s(t)$ has zero constant term, and multiplication by
$f_s(t)$ acts on each direct summand $k[t]/((t - \lambda_\nu)^m)$ in the decomposition (15.13) as
multiplication by $\lambda_\nu$. Thus, the operator $F_s \stackrel{\text{def}}{=} f_s(F)$ is diagonalizable. The operator
$F_n \stackrel{\text{def}}{=} F - F_s$ acts on $k[t]/((t - \lambda_\nu)^m)$ as multiplication by $t - \lambda_\nu$. Hence, $F_n$ is
nilpotent. As polynomials in $F$, both operators $F_s$, $F_n$ commute with $F$ and with
each other. This proves the existence of $F_s$, $F_n$ as well as the last two statements.
It remains to check uniqueness. Let $F = F'_s + F'_n$ be some other decomposition
satisfying the first statement. Since $F'_s$ and $F'_n$ commute, they both commute with
$F = F'_s + F'_n$ and with every polynomial of $F$ including $F_s$ and $F_n$. Therefore, each
$\lambda$-eigenspace $V_\lambda$ of $F_s$ is $F'_s$-invariant, and the restriction $F'_s|_{V_\lambda}$ is diagonalizable.
This forces $F'_s|_{V_\lambda} = \lambda \cdot \mathrm{Id}_{V_\lambda} = F_s|_{V_\lambda}$, because the difference $F_s - F'_s = F'_n - F_n$ is
nilpotent and therefore has zero spectrum.


Exercise 15.19 Check that the sum of two commuting nilpotent operators is also
nilpotent.


We conclude that $F'_s = F_s$ and therefore $F'_n = F - F'_s = F - F_s = F_n$. □


Definition 15.2 The operators $F_s$ and $F_n$ from Theorem 15.3 are called the
semisimple and nilpotent components of the operator $F$.
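The construction in the proof of Theorem 15.3 can be followed by hand on a small example. For the sample operator $F = J_2(1) \oplus J_1(2)$ (an assumption made here for illustration), taking $a_1 = 2$, $a_2 = 1$ gives $f_1 = -t(t - 2)$, $f_2 = (t - 1)^2$, $g_1 = 2 - t$, $g_2 = 1$, and hence $f_s(t) = t^4 - 3t^3 + 2t^2 + t$; this polynomial was worked out by hand following the proof and is one admissible choice, not the only one. The Python sketch below evaluates it at $F$ and checks the properties claimed in the theorem. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def poly_at(coeffs, f):
    """Evaluate sum(coeffs[k] * F^k) by Horner's rule; coeffs[0] is the constant term."""
    n = len(f)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, f)
        for i in range(n):
            result[i][i] += c
    return result


F = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]
f_s = [0, 1, 2, -3, 1]   # t + 2t^2 - 3t^3 + t^4, zero constant term
F_s = poly_at(f_s, F)    # semisimple component
F_n = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(F, F_s)]  # nilpotent component
```

Here `F_s` comes out as the diagonal matrix with entries 1, 1, 2, and `F_n` has a single 1 above the diagonal; the two commute and `F_n` squares to zero.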


15.4 Functions of Operators 


15.4.1 Evaluation of Functions on an Operator 


In this section we always assume that $k = \mathbb{C}$. Let $F : V \to V$ be a linear operator
with $\operatorname{Spec} F = \{\lambda_1, \lambda_2, \dots, \lambda_r\}$. Recall that we write $m_\lambda = m_{(t - \lambda)}(F)$ for the
maximal integer $m$ such that $(t - \lambda)^m \in \mathcal{E}\ell(F)$ and write

$$\mu_F(t) = \prod_{\lambda \in \operatorname{Spec} F} (t - \lambda)^{m_\lambda}$$

for the minimal polynomial of $F$. We say that a $\mathbb{C}$-subalgebra $\mathcal{C}$ in the algebra $\mathbb{C}^{\mathbb{C}}$
of all functions $\mathbb{C} \to \mathbb{C}$ is suitable for evaluation at $F$ if $\mathcal{C} \supset \mathbb{C}[t]$ and for each
$\lambda \in \operatorname{Spec} F$ and $f \in \mathcal{C}$, there exists $g \in \mathcal{C}$ such that

$$f(z) = f(\lambda) + \frac{f'(\lambda)}{1!}(z - \lambda) + \dots + \frac{f^{(m_\lambda - 1)}(\lambda)}{(m_\lambda - 1)!}(z - \lambda)^{m_\lambda - 1} + g(z) \cdot (z - \lambda)^{m_\lambda}, \tag{15.14}$$

where $f^{(k)} = d^k f/dz^k$ means the $k$th derivative. For example, suitable for evaluation
at $F$ is the algebra of all functions $f : \mathbb{C} \to \mathbb{C}$ such that for each $\lambda \in \operatorname{Spec} F$,
the Taylor expansion of $f$ at $\lambda$ has nonzero radius of convergence. In particular, the
algebra of all analytic functions²³ $\mathbb{C} \to \mathbb{C}$ is suitable for evaluation at any linear
operator $F$. Our terminology is justified by the following claim.


²³ That is, power series converging everywhere in $\mathbb{C}$.


Theorem 15.4 Let a subalgebra $\mathcal{C} \subset \mathbb{C}^{\mathbb{C}}$ be suitable for evaluation at an
operator $F : V \to V$. Then there exists a unique homomorphism of $\mathbb{C}$-algebras
$\mathrm{ev}_F : \mathcal{C} \to \operatorname{End} V$ whose restriction to the polynomial subalgebra $\mathbb{C}[t] \subset \mathcal{C}$
coincides with the standard evaluation of polynomials $f(t) \mapsto f(F)$. Moreover, for
every $f \in \mathcal{C}$, there exists a polynomial $P_{f,F} \in \mathbb{C}[t]$ such that $f(F) = P_{f,F}(F)$. The
residue class of the polynomial $P_{f,F}$ modulo $(\mu_F)$ is uniquely determined by the
equalities $P_{f,F}^{(k)}(\lambda) = f^{(k)}(\lambda)$ for all $0 \leq k \leq m_\lambda - 1$ and all $\lambda \in \operatorname{Spec} F$, where $P_{f,F}^{(k)}$
means the $k$th derivative.


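Proof Let $\mathrm{ev}_F : \mathcal{C} \to \operatorname{End} V$ be a required homomorphism of $\mathbb{C}$-algebras. Fix an
arbitrary function $f \in \mathcal{C}$ and evaluate both sides of (15.14) at $F$. This leads to the
equality

$$f(F) = f(\lambda) \cdot E + f'(\lambda) \cdot (F - \lambda E) + \dots + \frac{f^{(m_\lambda - 1)}(\lambda)}{(m_\lambda - 1)!}(F - \lambda E)^{m_\lambda - 1} + g_\lambda(F)(F - \lambda E)^{m_\lambda}, \tag{15.15}$$

where $g_\lambda$ denotes the function $g$ provided by (15.14) for the given $\lambda$.
Since the last summand annihilates the $\lambda$-root subspace $K_\lambda = \ker(F - \lambda E)^{m_\lambda}$ of $F$,
the operator $f(F)$ sends $K_\lambda$ to itself and acts there as $f_\lambda(F)$, where

$$f_\lambda(t) = f(\lambda) + f'(\lambda) \cdot (t - \lambda) + \dots + f^{(m_\lambda - 1)}(\lambda) \cdot (t - \lambda)^{m_\lambda - 1}/(m_\lambda - 1)! \tag{15.16}$$

is the $(m_\lambda - 1)$th jet of the function $f$ at the point $\lambda \in \mathbb{C}$. By the Chinese remainder
theorem, there exists a unique residue class $[P_{f,F}] \in \mathbb{C}[t]/(\mu_F)$ congruent to $f_\lambda(t)$
modulo $(t - \lambda)^{m_\lambda}$ for all $\lambda \in \operatorname{Spec} F$. Therefore, $f(F)$ acts on each root subspace $K_\lambda$
exactly as $P_{f,F}(F)$. Hence, $f(F) = P_{f,F}(F)$ on the whole space $V = \bigoplus_{\lambda \in \operatorname{Spec} F} K_\lambda$.
This establishes the uniqueness of evaluation and the last statement of the theorem.
It remains to check that the rule $f(F) \stackrel{\text{def}}{=} P_{f,F}(F)$ actually defines a homomorphism
of $\mathbb{C}$-algebras $\mathcal{C} \to \operatorname{End} V$. Write $j_\lambda^m : \mathcal{C} \to \mathbb{C}[t]/((t - \lambda)^m)$ for the map sending a
function $f \in \mathcal{C}$ to its $(m - 1)$th jet at $\lambda$,

$$j_\lambda^m f \stackrel{\text{def}}{=} \sum_{k=0}^{m-1} \frac{1}{k!} f^{(k)}(\lambda)(t - \lambda)^k \mod (t - \lambda)^m,$$

and consider the direct product of these maps over all points $\lambda \in \operatorname{Spec} F$:

$$j : \mathcal{C} \to \prod_{\lambda \in \operatorname{Spec} F} \mathbb{C}[t]/((t - \lambda)^{m_\lambda}) \simeq \mathbb{C}[t]/(\mu_F), \qquad f \mapsto \left(j_{\lambda_1}^{m_{\lambda_1}} f, \dots, j_{\lambda_r}^{m_{\lambda_r}} f\right). \tag{15.17}$$


Exercise 15.20 Verify that (15.17) is a homomorphism of $\mathbb{C}$-algebras.


Our map $f \mapsto P_{f,F}(F)$ is the homomorphism (15.17) followed by the standard
evaluation of polynomials $\mathrm{ev}_F : \mathbb{C}[t]/(\mu_F) \to \operatorname{End} V$. Therefore, it is a homomorphism
of $\mathbb{C}$-algebras as well. □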


15.4.2 Interpolating Polynomial 


A polynomial $P_{f,F} \in \mathbb{C}[t]$ from Theorem 15.4 is called an interpolating polynomial
for the evaluation of a function $f \in \mathcal{C}$ at an operator $F : V \to V$. Note that it
crucially depends on both the function and the operator. In particular, the values
of the same function on different operators are usually evaluated by completely
different interpolating polynomials. At the same time, an interpolating polynomial
is defined only up to addition of polynomial multiples of the minimal polynomial
$\mu_F$. In practical computations, if the irreducible factorization of the characteristic
polynomial is known,

$$\chi_F = \prod_{\lambda \in \operatorname{Spec} F} (t - \lambda)^{N_\lambda},$$

then certainly $N_\lambda \geq m_\lambda$, and we can take $P_{f,F}$ to be equal to the unique polynomial
of degree less than $\dim V$ such that $P_{f,F}^{(k)}(\lambda) = f^{(k)}(\lambda)$ for all $\lambda \in \operatorname{Spec} F$ and $0 \leq k \leq
N_\lambda - 1$.
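As a small numerical illustration, take $f = \exp$ and a $2 \times 2$ matrix $F$ satisfying $(F - \lambda E)^2 = 0$. The interpolation conditions $P(\lambda) = e^\lambda$, $P'(\lambda) = e^\lambda$ give $P(t) = e^\lambda + e^\lambda (t - \lambda)$. The Python sketch below evaluates $P(F)$ and compares it with a partial sum of the exponential series. The sample matrix, the number of series terms, and the function names are assumptions made for this sketch, and floating-point arithmetic is used, so the comparison is only approximate.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def exp_by_series(f, terms=30):
    """Partial sum of the series E + F + F^2/2! + ... with the given number of terms."""
    n = len(f)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, f)]
        total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total


def exp_by_interpolation(f, lam):
    """P(F) for P(t) = e^lam + e^lam (t - lam); valid when (F - lam*E)^2 = 0."""
    e = math.exp(lam)
    n = len(f)
    return [[e * (i == j) + e * (f[i][j] - lam * (i == j)) for j in range(n)]
            for i in range(n)]


lam = 0.5
F = [[lam, 1.0], [0.0, lam]]           # Jordan block of size 2
by_series = exp_by_series(F)
by_polynomial = exp_by_interpolation(F, lam)
max_gap = max(abs(a - b)
              for r1, r2 in zip(by_series, by_polynomial)
              for a, b in zip(r1, r2))
```

Both computations agree up to rounding error, and the interpolation route needs only the two numbers $f(\lambda)$ and $f'(\lambda)$.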


Example 15.4 (Power Function and Linear Recurrence Equations) The solution of
the linear recurrence equation

$$z_n + a_1 z_{n-1} + a_2 z_{n-2} + \dots + a_m z_{n-m} = 0 \tag{15.18}$$

of order $m$ in the unknown sequence $(z_n)$ is equivalent to evaluation of the power
function $t \mapsto t^n$ on the shifting matrix of the sequence,

$$S = \begin{pmatrix} 0 & 0 & \cdots & 0 & \alpha_m \\ 1 & 0 & \ddots & \vdots & \alpha_{m-1} \\ 0 & 1 & \ddots & 0 & \vdots \\ \vdots & \ddots & \ddots & 0 & \alpha_2 \\ 0 & \cdots & 0 & 1 & \alpha_1 \end{pmatrix},$$

which has size $m \times m$ and is uniquely determined by

$$(z_{k+1}, z_{k+2}, \dots, z_{k+m}) \cdot S = (z_{k+2}, z_{k+3}, \dots, z_{k+m+1})$$

(comparing the last coordinates with (15.18) forces $\alpha_i = -a_i$).
Indeed, the $n$th term of the desired sequence is equal to the first coordinate of the
vector

$$(z_n, z_{n+1}, \dots, z_{n+m-1}) = (z_0, z_1, \dots, z_{m-1}) \cdot S^n,$$

as soon as the initial terms $(z_0, z_1, \dots, z_{m-1})$ are given. We know from Theorem 15.4
that $S^n$ can be evaluated by substitution $t = S$ into the interpolating polynomial
$P_{t^n,S}(t)$, whose degree is less than $m$, and the coefficients are found from a system of
$m$ usual linear equations. Note that as soon as $S^n$ has been found, we can solve
the same recurrence equation for different initial data almost without additional
computations.
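The relation between the recurrence and the shifting matrix can be checked numerically. The Python sketch below builds $S$ from the coefficients $a_1, \dots, a_m$ and reads off $z_n$ as the first coordinate of the row vector $(z_0, \dots, z_{m-1}) \cdot S^n$. Note that it computes $S^n$ by repeated squaring rather than through the interpolating polynomial, so it illustrates the matrix formulation only; the function names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def shifting_matrix(a):
    """Shifting matrix for z_n + a[0] z_{n-1} + ... + a[m-1] z_{n-m} = 0."""
    m = len(a)
    s = [[int(i == j + 1) for j in range(m)] for i in range(m)]  # ones below the diagonal
    for i in range(m):
        s[i][m - 1] = -a[m - 1 - i]  # last column: alpha_m, ..., alpha_1 with alpha_i = -a_i
    return s


def matpow(s, n):
    """n-th power of a square matrix by repeated squaring."""
    size = len(s)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = s
    while n:
        if n & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        n >>= 1
    return result


def nth_term(a, initial, n):
    """z_n computed from the initial terms (z_0, ..., z_{m-1})."""
    power = matpow(shifting_matrix(a), n)
    # first coordinate of the row vector initial * S^n
    return sum(z * row[0] for z, row in zip(initial, power))
```

For instance, the Fibonacci recurrence $z_n - z_{n-1} - z_{n-2} = 0$ corresponds to `a = [-1, -1]`, and with the initial terms 0, 1 the call `nth_term([-1, -1], [0, 1], 10)` gives 55.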


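Example 15.5 (Fibonacci Numbers Revisited) The shifting matrix for the Fibonacci
numbers²⁴ $z_n = z_{n-1} + z_{n-2}$ is $S = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$. Its characteristic polynomial
$\chi_S(t) = t^2 - t \operatorname{tr} S + \det S = t^2 - t - 1 = (t - \lambda_+)(t - \lambda_-)$ has roots $\lambda_\pm = (1 \pm \sqrt{5})/2$,
and the power function $t^n$ takes values $\lambda_\pm^n$ on them. The interpolating polynomial
$P_{t^n,S}(t) = at + b$ is linear. Its coefficients are found at once by Cramer's rules from
the equations

$$a\lambda_+ + b = \lambda_+^n, \qquad a\lambda_- + b = \lambda_-^n,$$

and, since $\lambda_+\lambda_- = -1$, they are equal to $a = (\lambda_+^n - \lambda_-^n)/(\lambda_+ - \lambda_-)$,
$b = (\lambda_+^{n-1} - \lambda_-^{n-1})/(\lambda_+ - \lambda_-)$. Therefore,

$$S^n = aS + bE = \begin{pmatrix} b & a \\ a & a + b \end{pmatrix}.$$

For the classical initial terms $z_0 = 0$, $z_1 = 1$, we get $(z_n, z_{n+1}) = (0, 1) \cdot S^n =
(a, a + b)$, i.e.,

$$z_n = a = \frac{\left((1 + \sqrt{5})/2\right)^n - \left((1 - \sqrt{5})/2\right)^n}{\sqrt{5}}.$$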

²⁴ Compare with Example 4.4 on p. 81.


Proposition 15.9 Under the conditions of Theorem 15.4, the spectrum $\operatorname{Spec} f(F)$
consists of the numbers $f(\lambda)$, $\lambda \in \operatorname{Spec} F$. If $f'(\lambda) \neq 0$, then the elementary divisors
$(t - \lambda)^m \in \mathcal{E}\ell(F)$ are in bijection with the elementary divisors $(t - f(\lambda))^m \in \mathcal{E}\ell(f(F))$.
If $f'(\lambda) = 0$, then each elementary divisor $t - \lambda \in \mathcal{E}\ell(F)$ produces an elementary
divisor $t - f(\lambda) \in \mathcal{E}\ell(f(F))$, and each elementary divisor $(t - \lambda)^m \in \mathcal{E}\ell(F)$ with $m > 1$
produces more than one elementary divisor $(t - f(\lambda))^\ell \in \mathcal{E}\ell(f(F))$ with $\ell < m$. The
whole of $\mathcal{E}\ell(f(F))$ is exhausted by the elementary divisors just described.


Proof Realize $F$ as multiplication by $t$ in

$$\frac{\mathbb{C}[t]}{((t - \lambda_1)^{m_1})} \oplus \dots \oplus \frac{\mathbb{C}[t]}{((t - \lambda_s)^{m_s})}.$$

The proof of Theorem 15.4 shows that the semisimple and nilpotent components of
the operator $f(F)$ restricted to the $(t - \lambda)$-torsion submodule $K_\lambda$ are $f_s(F) = f(\lambda) \cdot \mathrm{Id}$
and $f_n(F) = f'(\lambda) \cdot \eta + \tfrac{1}{2} f''(\lambda) \cdot \eta^2 + \cdots$, where $\eta$ denotes the nilpotent operator
provided by multiplication by $(t - \lambda)$. The operator $\eta$ has exactly one Jordan chain of
maximal length $k$ on each direct summand $\mathbb{C}[t]/((t - \lambda)^k)$ within $K_\lambda$. If $f'(\lambda) \neq 0$,
then $f_n^{k-1}(F) = f'(\lambda)^{k-1} \cdot \eta^{k-1} \neq 0$. This forces $f_n(F)$ to have exactly one Jordan
chain of length $k$ on $\mathbb{C}[t]/((t - \lambda)^k)$ as well. If $f'(\lambda) = 0$ and $k > 1$, we get $f_n^\ell(F) =
0$ for some $\ell < k$. Therefore, the cyclic type of $f_n(F)$ restricted to $\mathbb{C}[t]/((t - \lambda)^k)$
consists of more than one Jordan chain, which all are strictly shorter than $k$. □


Exercise 15.21 Show that the matrix $J_m^{-1}(\lambda)$ inverse to the Jordan block²⁵ of size
$m$ is conjugate to the Jordan block $J_m(\lambda^{-1})$.


15.4.3 Comparison with Analytic Approaches


Analytic methods to extend the evaluation map

$$\mathrm{ev}_F : \mathbb{C}[z] \to \mathrm{Mat}_n(\mathbb{C}), \quad z \mapsto F \in \mathrm{Mat}_n(\mathbb{C}),$$

to some larger algebra $\mathcal{C} \supset \mathbb{C}[z]$ of functions $\mathbb{C} \to \mathbb{C}$ usually provide both spaces $\mathcal{C}$,
$\mathrm{Mat}_n(\mathbb{C})$ with suitable topologies, then approximate functions $f \in \mathcal{C}$ by sequences of
polynomials $f_\nu$ converging to $f$ as $\nu \to \infty$, and then define $f(F)$ as a limit of matrices
$f_\nu(F)$. Of course, one should check that $f(F)$ depends only on $f$ but not on the choice
of sequence approximating $f$ and verify that the resulting map $\mathrm{ev}_F : \mathcal{C} \to \mathrm{Mat}_n(\mathbb{C})$ is
a homomorphism of $\mathbb{C}$-algebras²⁶ (otherwise, it would probably be of no use). But
whatever sequence of polynomials $f_\nu$ is used to approximate $f$, the corresponding
sequence of matrices $f_\nu(F)$ lies in the linear span of powers $F^m$, $0 \leq m \leq n - 1$.
Thus, if the linear subspaces of $\mathrm{Mat}_n(\mathbb{C})$ are closed in the topology used to define
the evaluation, then $\lim f_\nu(F)$ has to be a polynomial in $F$ of degree at most $n - 1$.
In other words, whatever analytic approach is used to construct the homomorphism
$\mathrm{ev}_F : \mathcal{C} \to \mathrm{Mat}_n(\mathbb{C})$, its value on a given function $f \in \mathcal{C}$ should be computed a priori
by Theorem 15.4.


²⁵ See formula (15.9) on p. 375.

²⁶ It is highly edifying for students to realize this program independently using the standard
topologies, in which the convergence of functions means the absolute convergence over every
disk in $\mathbb{C}$, and the convergence of matrices means the convergence with respect to the distance on
$\mathrm{Mat}_n(\mathbb{C})$ defined by $|A, B| = \|A - B\|$, where $\|C\| \stackrel{\text{def}}{=} \max_{v \in \mathbb{C}^n \smallsetminus 0} |Cv|/|v|$ and $|w|^2 \stackrel{\text{def}}{=} \sum |z_i|^2$
for $w = (z_1, z_2, \dots, z_n) \in \mathbb{C}^n$. A working test for such a realization could be a straightforward
computation of $e^{J_n(\lambda)}$ directly from the definitions.


Problems for Independent Solution to Chap. 15 


Problem 15.1 Give an explicit example of a noncyclic²⁷ nonidentical linear
operator.


Problem 15.2 Describe all invariant subspaces of an operator $F : k^n \to k^n$ whose
matrix in the standard basis is diagonal with distinct diagonal elements.


Problem 15.3 Show that the minimal polynomial of every matrix of rank 1 over an
arbitrary field is quadratic.


Problem 15.4 Is there some linear operator F over Q such that (a) wr = 7 — 1 
and yr = (© —1, (b) wr = (@ — 1) — 2) and yr = (¢ — 1) - 2), 
(c) we = (t—1)°(t — 2) and yr = (t — 1)°(t — 2)? If yes, give an explicit 
example, if not, explain why. 

Problem 15.5 For $p = 2, 3, 5$, enumerate the conjugation classes of the matrices in
(a) $\mathrm{Mat}_2(\mathbb{F}_p)$, (b) $\mathrm{GL}_2(\mathbb{F}_p)$, (c) $\mathrm{SL}_2(\mathbb{F}_p)$.

Problem 15.6 Let the characteristic polynomial of an operator $F : V \to V$ be
irreducible. Show that for every $v \in V \smallsetminus 0$, the vectors $v, Fv, \dots, F^{\dim V - 1}v$
form a basis in $V$.


Problem 15.7 Let the degree of the minimal polynomial of an operator $F : V \to V$
be equal to $\dim V$. Show that every operator commuting with $F$ can be written as
a polynomial in $F$.

Problem 15.8 Given an operator $F$ over an algebraically closed field $k$, let the
operator $G$ commute with all operators commuting with $F$. Is it true that $G = f(F)$
for some $f \in k[t]$?

Problem 15.9 Is there in $\mathrm{Mat}_n(\mathbb{C})$ some $(n + 1)$-dimensional subspace consisting
of mutually commuting matrices?

Problem 15.10 For commuting operators $F$, $G$, prove that $(F + G)_s = F_s + G_s$
and $(F + G)_n = F_n + G_n$, where $C_s$, $C_n$ mean the semisimple and nilpotent
components²⁸ of the operator $C$.

Problem 15.11 Let the matrix of an operator $\mathbb{R}^n \to \mathbb{R}^n$ have $\lambda_1, \lambda_2, \dots, \lambda_n$ on the
secondary diagonal and zeros everywhere else. For which $\lambda_1, \lambda_2, \dots, \lambda_n$ is this
operator diagonalizable over $\mathbb{R}$?


°7See Sect. 15.2.3 on p. 370. 
?8See Definition 15.2 on p. 379. 
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Problem 15.12 Let a Q-linear operator F satisfy the equality F? = 6 F?7-11F+6E. 
Can F be nondiagonalizable over Q? 


Problem 15.13 Give an explicit example of two diagonalizable operators with 
nondiagonalizable composition. 


Problem 15.14 Let the minimal polynomial of an operator F : V → V be factored
as μ_F = g_1 g_2, where GCD(g_1, g_2) = 1. Show that V = U_1 ⊕ U_2, where both
subspaces are F-invariant and the minimal polynomial of the restriction F|_{U_i}
equals g_i for both i = 1, 2.

Problem 15.15 Let an operator F : k^n → k^n have tr F^k = 0 for all 1 ≤ k ≤ n. Show
that F is nilpotent.

Problem 15.16 Let operators A, B satisfy the equality AB − BA = B. Show that B
is nilpotent.

Problem 15.17* (Bart’s Lemma) Show that for every A, B ∈ Mat_n(C) such that
rk[A, B] = 1, there exists a nonzero v ∈ C^n such that Av = λv and Bv = μv for
some λ, μ ∈ C.

Problem 15.18 Establish a bijection between the direct sum decompositions V =
U_1 ⊕ U_2 ⊕ ⋯ ⊕ U_s and collections of nontrivial projectors π_1, π_2, …, π_s ∈
End V such that π_1 + π_2 + ⋯ + π_s = 1 and π_i π_j = π_j π_i = 0 for all i ≠ j.

Problem 15.19 Find the Jordan normal form of the square J_m²(λ) of the Jordan block
J_m(λ) (a) for λ ≠ 0, (b) for λ = 0.

Problem 15.20 Write finite explicit expressions for the matrix elements of f(J_m(λ))
in terms of f and its derivatives at λ for every function f : C → C analytic in some
neighborhood of λ ∈ C.


Problem 15.21 Describe the eigenspaces and root subspaces of the following linear
operators:

(a) d/dx acting on the R-linear span of the functions sin(x), cos(x), …, sin(nx),
cos(nx);

(b) d/dz acting on the functions C → C of the form e^{λz} f(z), where f ∈ C[z]_{≤n} and
λ ∈ C is fixed;

(c) x d/dx acting on Q[x]_{≤n} and Σ x_i ∂/∂x_i acting on Q[x_1, x_2, …, x_n]_{≤m};

(d) f(x, y) ↦ f(x − 1, y + 1) acting on the Q-linear span of the monomials x^m y^n with
0 ≤ m, n ≤ 2;

(e) FQ) > fy @y + 29°) 0) dy acting on Ris] <3; 

(f) f(x) ↦ f(ax + b), where a, b ∈ Q are fixed, acting on Q[x]_{≤n}.


Problem 15.22 Solve in Mat (C) the equations 
3 1 62 
r= = : 
(a) (3 a (b) é )
Problem 15.23 Compute A^50, sin A, and e^A for the following matrices A:


(!, em (3) 10(12)0(23} 


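Problem 15.24 Two operators Q³ → Q³ have in the standard basis of Q³ the
matrices

\[
\begin{pmatrix} 5 & -1 & -1 \\ -1 & 5 & -1 \\ -1 & -1 & 5 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} -6 & 2 & 3 \\ 2 & -3 & 6 \\ 3 & 6 & 2 \end{pmatrix}.
\]

Find all subspaces in Q³ that are invariant for both operators simultaneously.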


Problem 15.25 Find the minimal polynomial over R and over C for the matrix 


5 5 1 —10 
25 —2 —-9 
O-1 1 0 

3 0 -8 


and write the Jordan normal form of this matrix over C. 


Problem 15.26 Find the Jordan normal forms over the field C for the following 
matrices: 


21-27 31-39 2 ee: 
04-45 24-69 6 Ib 0 
b oJ > 
@) 153 67 (3379 ©) s 19 9-6 
11-32 11-32 Hatt AS 
nn—-Iln—-2-::--1 OPT Ot 0 
Orde eta? 001 
(dy Oe eS ee) Se, Fan GH 
a cae 
me Ht 105 3210-0 


Chapter 16 
Bilinear Forms 


In this section we assume by default that char k ≠ 2.


16.1 Bilinear Forms and Correlations 


16.1.1 Space with Bilinear Form 


Let V be a finite-dimensional vector space over a field k. We write V* for its dual
space and ⟨∗, ∗⟩ : V × V* → k for the pairing between vectors and covectors.¹
Recall that a map β : V × V → k is called a bilinear form on V if for all λ ∈ k
and all u, w, u_1, u_2, w_1, w_2 ∈ V, we have

\[
\begin{aligned}
\beta(u, \lambda w) &= \lambda\,\beta(u, w) = \beta(\lambda u, w),\\
\beta(u_1 + u_2, w_1 + w_2) &= \beta(u_1, w_1) + \beta(u_1, w_2) + \beta(u_2, w_1) + \beta(u_2, w_2).
\end{aligned}
\]

An example of a bilinear form is provided by the Euclidean structure on a real vector
space.²

Exercise 16.1 Check that the bilinear forms form a vector subspace in the vector
space of all functions V × V → k.

Associated with a bilinear form β : V × V → k is the right correlation map

\[ \beta : V \to V^*, \qquad v \mapsto \beta(\ast, v), \tag{16.1} \]

which sends a vector v ∈ V to the covector u ↦ β(u, v). Every linear map (16.1) is
a correlation map for a unique bilinear form on V, which is defined by

\[ \beta(u, w) = \langle u, \beta w \rangle . \tag{16.2} \]

1 See Example 7.6 on p. 161.
2 See Sect. 10.1 on p. 229.


Exercise 16.2 Verify that the correspondence (16.2) establishes a linear isomorphism
between the vector space of bilinear forms on V and the vector space
Hom(V, V*) of linear maps V → V*.

We have deliberately denoted both the bilinear form and its right correlation by
the same letter. In what follows, we will often make no distinction between them
and will identify the space of bilinear forms with Hom(V, V*). Note that both have
dimension n² as soon as dim V = n. We call a pair (V, β) a space with bilinear form
or just a bilinear form over k. For two such spaces V_1, V_2 with bilinear forms β_1, β_2,
a linear map f : V_1 → V_2 is called isometric or a homomorphism of bilinear forms
if β_1(v, w) = β_2(f(v), f(w)) for all v, w ∈ V_1. In the language of right correlations,
this means that the following diagram is commutative:

\[
\begin{array}{ccc}
V_1 & \xrightarrow{\;f\;} & V_2 \\
{\scriptstyle \beta_1}\big\downarrow & & \big\downarrow{\scriptstyle \beta_2} \\
V_1^* & \xleftarrow{\;f^*\;} & V_2^*
\end{array}
\tag{16.3}
\]

i.e., β_1 = f*β_2 f, where f* : V_2* → V_1* is the dual linear map³ to f defined by
⟨v, f*ξ⟩ = ⟨fv, ξ⟩ for all ξ ∈ V_2*, v ∈ V_1.

Exercise 16.3 Convince yourself that the relation β_1(u, w) = β_2(f(u), f(w)) for
all u, w ∈ V_1 is equivalent to the commutativity of diagram (16.3).

Two bilinear forms β_1, β_2 on spaces V_1, V_2 are called isomorphic or equivalent if
there exists an isometric linear isomorphism of vector spaces V_1 ⥲ V_2.


16.1.2 Gramians


Let the vectors e = (e_1, e_2, …, e_n) be a basis of V. Then every bilinear form β on
V is uniquely determined by its values b_ij = β(e_i, e_j) on the pairs of basis vectors.
Indeed, given these values, for every pair of vectors u = Σ x_i e_i, w = Σ y_j e_j
we have

\[
\beta(u, w) = \beta\Bigl(\sum_i x_i e_i,\ \sum_j y_j e_j\Bigr) = \sum_{i,j} b_{ij}\, x_i y_j . \tag{16.4}
\]

The square matrix B_e = (b_ij) is called the Gram matrix of β in the basis e or simply
the Gramian.

3 See Sect. 7.3 on p. 164.


Exercise 16.4 Check that the matrix of the right correlation β : V → V* written in the
dual bases e and e* of V and V* coincides with the Gram matrix B_e of the bilinear
form β : V × V → k in the basis e.

If we write the basis vectors in the row matrix e = (e_1, e_2, …, e_n) with elements in
V and for u, w ∈ V put uw ≝ β(u, w) ∈ k as in Sect. 10.2 on p. 233, then B_e = e^t e,
and the computation (16.4) for the vectors u = ex, w = ey, where x, y ∈ k^n are
coordinate columns, can be rewritten as

\[ uw = u^t w = x^t e^t e\, y = x^t B_e\, y . \tag{16.4$'$} \]

More generally, associated with any two collections of vectors

\[ u = (u_1, u_2, \dots, u_k), \qquad w = (w_1, w_2, \dots, w_m), \]

is their reciprocal Gramian B_uw = (β(u_i, w_j)) = u^t w. The same computation as in
Sect. 10.2 on p. 233 shows that for u = e C_eu and w = f C_fw, we have

\[ B_{uw} = u^t w = (e\,C_{eu})^t (f\,C_{fw}) = C_{eu}^t\, e^t f\, C_{fw} = C_{eu}^t B_{ef}\, C_{fw} . \tag{16.5} \]

In particular, for any two bases e, f in V related by f = e C_ef, we have

\[ B_f = C_{ef}^t\, B_e\, C_{ef} . \tag{16.6} \]


It is instructive to compare this formula with the diagram (16.3). 
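As a small numeric illustration of formulas (16.4′) and (16.6), the following Python sketch evaluates a form through its Gramian and then changes the basis. The matrices `B_e` and `C_ef`, the test vectors, and all helper names are hypothetical choices made for this example only.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply(m, x):
    """Apply a matrix to a coordinate column."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def form(gram, x, y):
    """Evaluate beta(u, w) = x^t * B * y for coordinate columns x, y."""
    n = len(gram)
    return sum(x[i] * gram[i][j] * y[j] for i in range(n) for j in range(n))

# Assumed example data: a nonsymmetric Gramian in a basis e and a transition
# matrix C_ef, so that the new basis is f = e * C_ef.
B_e = [[1, 2], [0, 1]]
C_ef = [[1, 1], [0, 1]]

# Formula (16.6): B_f = C_ef^t * B_e * C_ef.
B_f = matmul(matmul(transpose(C_ef), B_e), C_ef)

# One pair of vectors, written in f-coordinates and in e-coordinates.
x_f, y_f = [1, 2], [3, -1]
x_e, y_e = apply(C_ef, x_f), apply(C_ef, y_f)
```

Evaluating `form(B_f, x_f, y_f)` and `form(B_e, x_e, y_e)` gives the same number, as it must: the value β(u, w) does not depend on the basis used to compute it.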


16.1.3 Left Correlation 


For a right correlation β : V → V*, its dual map β* : V** → V* can be also
viewed as a correlation β* : V → V* by means of the canonical identification⁴
V** ≃ V. Correlations β and β* are related by the equality ⟨u, β*w⟩ = ⟨w, βu⟩
for all u, w ∈ V. The correlation β* is a right correlation for the bilinear form
β*(u, w) = ⟨u, β*w⟩ = ⟨w, βu⟩ = β(w, u) constructed from β by swapping the
arguments. In the language of bilinear forms, the correlation β* : V → V* sends a
vector v ∈ V to the covector w ↦ β(v, w), i.e., maps v ↦ β(v, ∗). For this reason,
β* is called the left correlation map of the bilinear form β.

Exercise 16.5 Check that the matrix of the left correlation β* : V → V* in the
dual bases e, e* of V and V* equals B_e^t.

4 See Sect. 7.1.2 on p. 158.


16.1.4 Nondegeneracy 


A bilinear form β is called nondegenerate⁵ if it satisfies the equivalent conditions
listed in Proposition 16.1 below. Otherwise, β is called singular.⁶

Proposition 16.1 (Nondegeneracy Criteria) Let a bilinear form β : V × V → k
have Gramian B_e in some basis e = (e_1, e_2, …, e_n) of V. The following conditions
on β are equivalent:

(1) det B_e ≠ 0.

(2) ∀ w ∈ V ∖ 0 ∃ u ∈ V : β(u, w) ≠ 0.

(3) The left correlation map β* : V → V* is an isomorphism.

(4) ∀ ξ ∈ V* ∃ u_ξ ∈ V : ξ(v) = β(u_ξ, v) for all v ∈ V.

(5) ∀ u ∈ V ∖ 0 ∃ w ∈ V : β(u, w) ≠ 0.

(6) The right correlation map β : V → V* is an isomorphism.

(7) ∀ ξ ∈ V* ∃ w_ξ ∈ V : ξ(v) = β(v, w_ξ) for all v ∈ V.

If these conditions hold, then condition (1) is valid for every basis of V, and both
vectors u_ξ, w_ξ in (4), (7) are uniquely determined by the covector ξ.

Proof Conditions (2) and (4) mean that ker β* = 0 and im β* = V* respectively.
Each of them is equivalent to (3), because dim V = dim V*. For the same reason,
conditions (5), (6), (7) are mutually equivalent as well. Since the operators β, β*
have transposed matrices B_e, B_e^t in the dual bases e and e*, the bijectivity of any one
of them means that det B_e^t = det B_e ≠ 0. □


16.1.5 Kernels 


If a bilinear form β is singular, then both subspaces

\[
\begin{aligned}
V^{\perp} &= \ker \beta = \{\, u \in V \mid \forall\, v \in V\ \ \beta(v, u) = 0 \,\}, && (16.7)\\
{}^{\perp}V &= \ker \beta^* = \{\, u \in V \mid \forall\, v \in V\ \ \beta(u, v) = 0 \,\}, && (16.8)
\end{aligned}
\]

are nonzero. They are called the right and left kernels of the bilinear form β. In
general, V^⊥ ≠ ^⊥V. Nevertheless, dim V^⊥ = dim ^⊥V, because β and β* are dual to
each other and therefore have equal ranks.⁷

5 Or nonsingular.
6 Or degenerate.

Definition 16.1 The nonnegative integer rk β = dim im β = dim im β* =
codim ker β = codim ker β* = rk B, where B is the Gramian of β in an arbitrary
basis, is called the rank of the bilinear form β.
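The difference between the two kernels is easy to see on a tiny example. The sketch below assumes the hypothetical form β(u, w) = x_1 y_2 on k² (its Gramian `B` is chosen for illustration) and checks that the right and left kernels are different lines while the ranks of B and of its transpose agree. The function names are ad hoc.

```python
from fractions import Fraction

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def apply(m, x):
    """Apply a matrix to a coordinate column."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def rank(m):
    """Rank of a matrix over Q, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(a) for a in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Assumed example: beta(u, w) = x_1 * y_2 on k^2.
B = [[0, 1], [0, 0]]
e1, e2 = [1, 0], [0, 1]

# Right kernel (16.7): coordinate columns y with B * y = 0; here it contains e1.
# Left kernel (16.8): coordinate columns x with B^t * x = 0; here it contains e2.
in_right_kernel = apply(B, e1) == [0, 0]
in_left_kernel = apply(transpose(B), e2) == [0, 0]
```

Here e_1 spans the right kernel and e_2 spans the left kernel, so the two kernels differ although both are 1-dimensional.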


16.1.6 Nonsymmetric and (Skew)-Symmetric Forms 


A dualization β ↦ β* is a nontrivial linear involution⁸ on the space Hom(V, V*)
of correlations. Therefore, the space of correlations splits into a direct sum of ±1-
eigenspaces of the involution ∗:

\[ \operatorname{Hom}(V, V^*) = \operatorname{Hom}_+(V, V^*) \oplus \operatorname{Hom}_-(V, V^*), \]

where Hom_+(V, V*) consists of symmetric correlations β = β*, whereas
Hom_−(V, V*) is formed by skew-symmetric correlations β = −β*. Such
(skew) symmetric correlations produce symmetric and skew-symmetric bilinear
forms, which satisfy for all u, w ∈ V the identities β(u, w) = β(w, u) and
β(u, w) = −β(w, u) respectively.

Exercise 16.6 Find dim Hom_±(V, V*) for n-dimensional V.

All other bilinear forms β(u, w) ≠ ±β(w, u) are called nonsymmetric. Associated
with such a form β is the 2-dimensional subspace in Hom(V, V*) spanned by β and
β*. We call it the pencil of correlations of the bilinear form β and denote it by

\[ \Pi_\beta = \{\, t_1\beta - t_0\beta^* : V \to V^* \mid t_0, t_1 \in k \,\} \subset \operatorname{Hom}(V, V^*). \tag{16.9} \]

These correlations produce bilinear forms β_{(t_0,t_1)}(u, w) = t_1 β(u, w) − t_0 β(w, u).
For char k ≠ 2, among these forms there exists a unique, up to proportionality,
symmetric form β_+ = (β + β*)/2 and skew-symmetric form β_− = (β − β*)/2.
Note that β = β_+ + β_−, and this equality is the unique decomposition of β as a sum
of symmetric and skew-symmetric forms. The forms β_± are called the symmetric
and skew-symmetric components of β.
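In terms of Gramians, the decomposition β = β_+ + β_− is just the splitting of a square matrix into its symmetric and skew-symmetric parts. A minimal sketch over Q follows; the matrix `B` and the helper names are assumptions made for the example.

```python
from fractions import Fraction

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def symmetric_part(b):
    """Gramian of beta_+ = (beta + beta^*) / 2."""
    bt = transpose(b)
    n = len(b)
    return [[Fraction(b[i][j] + bt[i][j], 2) for j in range(n)] for i in range(n)]

def skew_part(b):
    """Gramian of beta_- = (beta - beta^*) / 2."""
    bt = transpose(b)
    n = len(b)
    return [[Fraction(b[i][j] - bt[i][j], 2) for j in range(n)] for i in range(n)]

# Assumed example Gramian of a nonsymmetric form on Q^2.
B = [[1, 4], [2, 3]]
B_plus = symmetric_part(B)
B_minus = skew_part(B)
```

Division by 2 is the only place where the assumption char k ≠ 2 is used.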


Exercise 16.7 Give an example of a nondegenerate bilinear form β whose symmetric
and skew-symmetric parts β_± are both singular.

7 In dual bases, the matrices B_e^t, B_e of these operators are transposes of each other and therefore
have equal ranks by Theorem 7.3 on p. 166.

8 See Example 15.2 on p. 373.

16.1.7 Characteristic Polynomial and Characteristic Values

The determinant

\[ \chi_\beta(t_0, t_1) \overset{\text{def}}{=} \det\bigl(t_1 B_e - t_0 B_e^t\bigr) \in k[t_0, t_1] \]

is called the characteristic polynomial of the bilinear form β. It is either homogeneous
of degree dim V or vanishes identically. The transposition of variables t_0 ↔ t_1
multiplies χ_β by (−1)^{dim V}:

\[
\begin{aligned}
\chi_\beta(t_0, t_1) &= \det\bigl(t_1 B_e - t_0 B_e^t\bigr) = \det\bigl(t_1 B_e - t_0 B_e^t\bigr)^t && (16.10)\\
&= \det\bigl(t_1 B_e^t - t_0 B_e\bigr) = (-1)^{\dim V} \chi_\beta(t_1, t_0). && (16.11)
\end{aligned}
\]

Up to multiplication by a nonzero constant, χ_β does not depend on the choice of
basis e whose Gramian B_e is used to evaluate the determinant, because in another
basis ε = e C_{eε},

\[
\begin{aligned}
\det\bigl(t_1 B_\varepsilon - t_0 B_\varepsilon^t\bigr)
&= \det\bigl(t_1 C_{e\varepsilon}^t B_e C_{e\varepsilon} - t_0 C_{e\varepsilon}^t B_e^t C_{e\varepsilon}\bigr) && (16.12)\\
&= \det\bigl(C_{e\varepsilon}^t\bigr) \det\bigl(t_1 B_e - t_0 B_e^t\bigr) \det\bigl(C_{e\varepsilon}\bigr)
 = \det\bigl(t_1 B_e - t_0 B_e^t\bigr) \cdot \det{}^2 C_{e\varepsilon}.
\end{aligned}
\]

A bilinear form β is called regular if χ_β(t_0, t_1) is a nonzero polynomial. In this
case, the roots of the characteristic polynomial on the projective line P_1 = P(Π_β)
are called the characteristic values of β. We write them as λ = t_0/t_1 ∈ k ⊔ ∞, where
0 = 0 : 1 and ∞ = 1 : 0 may appear as well if the regular form is degenerate. The
formula (16.10) forces the characteristic values different from ±1 to split into the
inverse pairs λ, λ^{−1} of equal multiplicities.

It follows from (16.12) that the characteristic values do not depend on the
choice of basis. The computation (16.12) also shows that isomorphic bilinear forms
have the same characteristic values. In terms of the homogeneous coordinate
t = (t_0 : t_1), the characteristic values correspond to the degenerate correlations
β_t = t_1 β − t_0 β* ∈ Π_β in the pencil (16.9). Every regular form β has at most dim V
characteristic values (considered up to proportionality and counted with
multiplicities). All the other correlations in the pencil Π_β are nonsingular.
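The characteristic polynomial can be probed numerically by evaluating det(t_1 B − t_0 B^t) at sample points. The sketch below does this for two hypothetical Gramians, `B2` and `B3`, chosen only for illustration; the helper names are ad hoc.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def chi(b, t0, t1):
    """Value of chi_beta(t0, t1) = det(t1 * B - t0 * B^t) at the point (t0, t1)."""
    bt = transpose(b)
    n = len(b)
    return det([[t1 * b[i][j] - t0 * bt[i][j] for j in range(n)] for i in range(n)])

# Assumed example Gramians of nonsymmetric forms.
B2 = [[1, 2], [0, 1]]                    # here chi(t0, t1) = (t0 + t1)^2
B3 = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # odd size: swapping t0, t1 changes the sign
```

For `B2` the polynomial is (t_0 + t_1)², so the only characteristic value is (1 : −1), with multiplicity 2. For `B3`, swapping the arguments changes the sign, in agreement with (16.10) and (16.11) for odd dim V.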

The characteristic polynomial of a (skew) symmetric bilinear form β,

\[ \det\bigl(t_1 B_e - t_0 B_e^t\bigr) = \det\bigl(t_1 B_e \mp t_0 B_e\bigr) = (t_1 \mp t_0)^n \det B_e, \]

is nonzero if and only if β is nondegenerate. Thus, the regularity of a
(skew) symmetric bilinear form is equivalent to nonsingularity. Each nonsingular
(skew) symmetric form has exactly one characteristic value. It has maximal
multiplicity dim V and is equal to (1 : 1) in the symmetric case and to (1 : −1) in
the skew-symmetric case. The degenerate form that corresponds to this value is the
zero form.

Example 16.1 (Euclidean Form) The symmetric bilinear form on the coordinate
space k^n with unit Gramian E in the standard basis is called the Euclidean form.
For k = R, it provides R^n with a Euclidean structure. For other ground fields,
the properties of the Euclidean form may be far from those usual in Euclidean
geometry. For example, over C, the nonzero vector e_1 − i e_2 ∈ C² has zero inner
product with itself. However, the Euclidean form is nonsingular. The bases in which
the Gramian of the Euclidean form equals E are called orthonormal. The existence
of an orthonormal basis for some bilinear form β means that β is equivalent to the
Euclidean form. In Corollary 16.4 on p. 410 below we will see that every symmetric
nonsingular bilinear form over an algebraically closed field k of characteristic
char k ≠ 2 is isomorphic to the Euclidean form.

Exercise 16.8 Find an n-dimensional subspace U ⊂ C^{2n} such that the Euclidean
form on C^{2n} is restricted to the zero form on U.


Example 16.2 (Hyperbolic Form) A symmetric bilinear form on an even-dimensional
coordinate space k^{2n} with Gramian

\[ H = \begin{pmatrix} 0 & E \\ E & 0 \end{pmatrix} \tag{16.13} \]

in the standard basis is called a hyperbolic form. Here E and 0 mean the identity and
the zero n × n matrices. Therefore, det H = (−1)^n and H is nondegenerate. Over an
algebraically closed field, the hyperbolic form is equivalent to the Euclidean form
and admits an orthonormal basis formed by the vectors

\[ \varepsilon_{2\nu-1} = (e_\nu - e_{n+\nu})/\sqrt{-2} \quad\text{and}\quad \varepsilon_{2\nu} = (e_\nu + e_{n+\nu})/\sqrt{2}, \qquad 1 \le \nu \le n . \]

Over R and Q, the hyperbolic form is not equivalent to the Euclidean form, because
for the latter form, the inner product of each nonzero vector with itself is positive,
whereas the hyperbolic form vanishes identically on the linear span of the first
n standard basis vectors. Every basis of k^{2n} with Gramian (16.13) is called a
hyperbolic basis.


Example 16.3 (Symplectic Form) A skew-symmetric form on an even-dimensional
coordinate space k^{2n} with Gramian

\[ J = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix} \tag{16.14} \]

in the standard basis is called symplectic. The matrix J in (16.14) is called a symplectic
unit. It has J² = −E and det J = 1. Thus, a symplectic form is nondegenerate. Every
basis of k^{2n} with Gramian (16.14) is called a symplectic basis. In Theorem 16.5 on
p. 410 below, we will see that every nondegenerate skew-symmetric bilinear form
over any field is isomorphic to a symplectic form. In particular, this means that every
space with a nonsingular skew-symmetric form must be even-dimensional.
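The stated properties of the matrices H and J from (16.13) and (16.14) are easy to confirm for a small n. The following sketch takes n = 2; the helper names are illustrative, not notation from the text.

```python
def block_matrix(n, upper_right, lower_left):
    """2n x 2n matrix with zero diagonal blocks and the given multiples of E off the diagonal."""
    m = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        m[i][n + i] = upper_right
        m[n + i][i] = lower_left
    return m

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

n = 2
H = block_matrix(n, 1, 1)     # hyperbolic Gramian (16.13)
J = block_matrix(n, 1, -1)    # symplectic unit (16.14)
minus_E = [[-int(i == j) for j in range(2 * n)] for i in range(2 * n)]

# The upper-left n x n blocks of H and J hold the values of the forms on the
# first n standard basis vectors; they vanish, so the span of e_1, ..., e_n
# is a subspace on which both forms are identically zero.
upper_left_H = [row[:n] for row in H[:n]]
upper_left_J = [row[:n] for row in J[:n]]
```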


Exercise 16.9 Check that every skew-symmetric square matrix of odd size is 
degenerate. 


Example 16.4 (Euler Form) Write D = d/dt : k[t] → k[t] for the differentiation
operator and V = k[D]/(D^{n+1}) for the quotient ring of the ring k[D] of linear
differential operators with constant coefficients⁹ by its principal ideal spanned by
D^{n+1}. There is a reverse of direction involution on V defined by

\[ \Phi(D) \mapsto \Phi^*(D) \overset{\text{def}}{=} \Phi(-D). \]

It takes the shift operator T = e^D : f(t) ↦ f(t + 1) to the inverse shift operator
T* = e^{−D} = T^{−1} : f(t) ↦ f(t − 1). Let us identify the space V* dual to V with
the space k[t]_{≤n} of polynomials in t of degree at most n,¹⁰ by means of the perfect
pairing¹¹

\[ \frac{k[D]}{(D^{n+1})} \times k[t]_{\le n} \to k, \qquad \langle \Phi, f \rangle = \Phi^* f(0) = \operatorname{ev}_0\bigl(\Phi(-D) f\bigr), \tag{16.15} \]

which contracts the residue class Φ(D) (mod D^{n+1}) and polynomial f(t) ∈ k[t]_{≤n} to
the constant term of the polynomial Φ*f.

Exercise 16.10 Verify that the pairing (16.15) is well defined and perfect.¹² Find
the basis of V dual to the standard monomial basis 1, t, …, t^n in V*.

Put γ_n(t) ≝ \binom{t+n}{n} = (t + 1)(t + 2) ⋯ (t + n)/n! ∈ V* and define the correlation
h : V → V* by h : Φ ↦ Φγ_n. The bilinear form h(Φ, Ψ) = Φ*Ψγ_n(0)
corresponding to this correlation is called an Euler form.¹³ In the basis formed by the
iterated shift operators 1, T, T², …, T^n, the Euler form has an upper unitriangular

9 This ring is isomorphic to the ring of polynomials k[x]. It consists of differential operators
Φ(D) = a_0 D^n + a_1 D^{n−1} + ⋯ + a_{n−1} D + a_n, which act on functions of t, e.g., on linear spaces
k[t] and k[[t]] as in Sect. 4.4 on p. 88. Note that V models the “solution space” of the differential
equation d^{n+1}y/dt^{n+1} = 0 in the unknown function y = y(t).

10 That is, with the actual solution space of the differential equation d^{n+1}y/dt^{n+1} = 0.

11 Note that it differs from that used in Problem 7.2 on p. 167 by changing the sign of D.

12 See Sect. 7.1.4 on p. 160.

13 It plays an important role in the theory of algebraic vector bundles on P_n. In this theory, the roles
of the vector spaces V and V* are respectively played by the Z-module of difference operators
Z[∇]/(∇^{n+1}), where ∇ = 1 − e^{−D}, and by the Z-module of integer-valued polynomials: M_n =
{f ∈ Q[t]_{≤n} | f(Z) ⊂ Z} (compare with Problem 14.30 on p. 359). Elements of the first module
are the characteristic classes of vector bundles, whereas the elements of the second are the Hilbert
polynomials of vector bundles. The formula computing the Euler form in terms of characteristic
classes is known as the Riemann–Roch theorem.

Gramian with elements

\[
h(T^i, T^j) = T^{\,j-i}\gamma_n(0) =
\begin{cases}
0 & \text{for } j < i,\\[2pt]
\dbinom{n+j-i}{n} & \text{for } j \ge i.
\end{cases}
\tag{16.16}
\]

Every basis in which the Gramian of a nonsymmetric bilinear form β becomes upper
unitriangular is called an exceptional¹⁴ (or semiorthonormal) basis for β.


16.2 Nondegenerate Forms 


16.2.1 Dual Bases 


In this section we consider a vector space V equipped with a nondegenerate bilinear
form β. Let e = (e_1, e_2, …, e_n) be any basis of V. The preimages of the dual basis
e* = (e_1*, e_2*, …, e_n*) in V* under the left and right correlation maps β*, β : V ⥲ V*
are respectively denoted by ^∨e = (^∨e_1, ^∨e_2, …, ^∨e_n) and e^∨ = (e_1^∨, e_2^∨, …, e_n^∨) and
called the left and right bases on V dual to e with respect to the bilinear form β.
Both dual bases are uniquely determined by the following orthogonality relations:

\[
\beta\bigl({}^{\vee}e_i, e_j\bigr) = \beta\bigl(e_i, e_j^{\vee}\bigr) =
\begin{cases}
1 & \text{for } i = j,\\
0 & \text{for } i \ne j,
\end{cases}
\tag{16.17}
\]

and are expressed through e as ^∨e = e (B_e^t)^{−1} and e^∨ = e B_e^{−1}. Once one of the
dual bases of e is known, the coordinates of every vector v ∈ V in the basis e can be
computed as the inner products of v with the dual basis vectors:

\[ v = \sum_i \beta\bigl({}^{\vee}e_i, v\bigr) \cdot e_i = \sum_i \beta\bigl(v, e_i^{\vee}\bigr) \cdot e_i . \tag{16.18} \]
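The formulas for the dual bases can be verified directly for a concrete Gramian. The sketch below works over Q with e taken to be the standard basis, so that vectors are their own coordinate columns. The Gramian `B`, the vector `v`, and all helper names are assumptions made for the illustration.

```python
from fractions import Fraction

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Inverse over Q by Gauss-Jordan elimination with exact fractions."""
    n = len(m)
    aug = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [a / aug[c][c] for a in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def gram(b, u, w):
    """Reciprocal Gramian u^t * B * w of two collections written as matrix columns."""
    return matmul(matmul(transpose(u), b), w)

B = [[1, 2], [0, 1]]                  # assumed nondegenerate Gramian in the basis e
E = [[1, 0], [0, 1]]                  # the basis e itself, as columns
left_dual = inverse(transpose(B))     # columns are the vectors of the left dual basis
right_dual = inverse(B)               # columns are the vectors of the right dual basis

v = [[5], [7]]                        # a test vector as a coordinate column
coords_from_left = gram(B, left_dual, v)     # column of the values beta(left dual e_i, v)
coords_from_right = gram(B, v, right_dual)   # row of the values beta(v, right dual e_i)
```

Both `coords_from_left` and `coords_from_right` recover the coordinates (5, 7) of v, as formula (16.18) predicts.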


16.2.2 Isotropic Subspaces 


A subspace U ⊂ V is called isotropic for a bilinear form β if

\[ \forall\, u, w \in U \quad \beta(u, w) = 0 . \]

For example, the linear spans of the first n and of the last n standard basis vectors in
k^{2n} are isotropic for both the hyperbolic and symplectic forms on k^{2n} considered in
Example 16.2 and Example 16.3 above. Note that all 1-dimensional subspaces are
isotropic for all skew-symmetric forms.

14 For details on exceptional bases of the Euler form over Z, see Problem 16.17 on p. 419.

Proposition 16.2 The dimension of an isotropic subspace U of an arbitrary
nondegenerate bilinear form on V is bounded above by the inequality

\[ 2 \dim U \le \dim V . \]

Proof A subspace U ⊂ V is isotropic for β if and only if β : V ⥲ V* sends U
to Ann U ⊂ V*. Since β is injective by the nondegeneracy assumption, dim U ≤
dim Ann U = dim V − dim U. □

Remark 16.1 Examples provided by symplectic and hyperbolic forms show that the
upper bound from Proposition 16.2 is exact.


16.2.3 Isometry Group 


Every isometric¹⁵ endomorphism g : V → V satisfies the relation g*βg = β,
which forces det² g · det β = det β. Since det β ≠ 0 for a nondegenerate form β,
we conclude that det g = ±1, and therefore g is invertible with g^{−1} = β^{−1}g*β.
The composition of isometries is clearly an isometry. Thus, all isometries of a given
nondegenerate form β on V form a group. It is called the isometry group¹⁶ of β
and is denoted by O_β(V). Isometries of determinant 1 are called special and form a
subgroup denoted by SO_β(V) = O_β(V) ∩ SL(V).
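In matrix form, the relation g*βg = β reads G^t B G = B, which can be tested mechanically. The sketch below checks it for a few hypothetical 2 × 2 matrices against the symplectic and Euclidean Gramians; all names are chosen for the illustration.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_isometry(g, b):
    """Check G^t * B * G = B, the matrix form of the relation g^* beta g = beta."""
    return matmul(matmul(transpose(g), b), g) == b

J = [[0, 1], [-1, 0]]            # symplectic Gramian on k^2
E = [[1, 0], [0, 1]]             # Euclidean Gramian on k^2

shear = [[2, 1], [1, 1]]         # assumed example with determinant 1
stretch = [[2, 0], [0, 1]]       # assumed example with determinant 2
reflection = [[1, 0], [0, -1]]   # assumed example with determinant -1
```

Here `shear` preserves the symplectic form and is special, `stretch` preserves neither form, and `reflection` is an isometry of the Euclidean form with determinant −1, so it lies in O but not in SO.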


16.2.4 Correspondence Between Forms and Operators 


A nondegenerate bilinear form β on V produces a linear isomorphism between the
vector spaces of linear endomorphisms of V and bilinear forms on V,

\[ \operatorname{End} V \;\xrightarrow{\ \sim\ }\; \operatorname{Hom}(V, V^*), \qquad f \mapsto \beta f, \tag{16.19} \]

which sends a linear operator f : V → V to the bilinear form βf(u, w) = β(u, fw).
The inverse isomorphism takes the correlation ψ : V → V* to the endomorphism
f = β^{−1}ψ : V → V. The isomorphism (16.19) establishes a bijection between
invertible operators and nonsingular bilinear forms.

15 See formula (16.3) on p. 388.
16 Or orthogonal group of β.

16.2.5 Canonical Operator


The operator ϰ = β^{−1}β* : V → V corresponding to the left correlation β*
under the isomorphism (16.19) is called the canonical or Serre operator of the
nondegenerate bilinear form β. It is uniquely determined by the condition

\[ \forall\, u, w \in V, \quad \beta(w, u) = \beta(u, \varkappa w). \tag{16.20} \]

We write ϰ_β for the canonical operator of β when a precise reference to β is
required. In every basis of V, the matrix K of the canonical operator ϰ is expressed
through the Gramian B of the form β as K = B^{−1}B^t.

Exercise 16.11 Check that under a transformation of a Gramian by the rule
B ↦ C^t B C for some C ∈ GL_n(k), the matrix K = B^{−1}B^t of the canonical operator is
transformed as K ↦ C^{−1}KC.

Therefore, equivalent nondegenerate bilinear forms have similar canonical operators.
Over an algebraically closed field of zero characteristic, the converse statement
is true as well. In Theorem 16.1 on p. 401 below we will prove that over such a
ground field, two nondegenerate bilinear forms are equivalent if and only if their
canonical operators are similar. Thus, the classification of nondegenerate bilinear
forms over an algebraically closed field of zero characteristic is completely reduced
to the enumeration of all collections of elementary divisors Eℓ(ϰ) that the canonical
operator of a nondegenerate bilinear form can have. We will do this in Theorem 16.3
on p. 407. Now let us mention some obvious constraints on Eℓ(ϰ).

Since β(u, w) = β(w, ϰu) = β(ϰu, ϰw) for all u, w ∈ V, the canonical operator
is isometric. Up to a nonzero constant factor, the characteristic polynomial of the
canonical operator

\[ \chi_\varkappa(t) = \det\bigl(tE - B^{-1}B^t\bigr) = \det{}^{-1}(B) \cdot \det\bigl(tB - B^t\bigr) = \det{}^{-1}(B) \cdot \chi_\beta(1, t) \]

coincides with the characteristic polynomial¹⁷ χ_β(t_0, t_1) of the form β expressed
through the local affine coordinate t = t_1/t_0 on P_1 = P(Π_β). Since 0, ∞ are
not among the characteristic values of a nondegenerate form, the eigenvalues of
ϰ coincide with the characteristic values of β regarding their multiplicities. In
particular, all eigenvalues of the canonical operator different from ±1 split into
inverse pairs λ, λ^{−1} of equal multiplicities.
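The defining property (16.20) and the isometry of the canonical operator translate into the matrix identities B K = B^t and K^t B K = B for K = B^{−1}B^t. The sketch below confirms them over Q for one hypothetical Gramian `B`; the helper names are ad hoc.

```python
from fractions import Fraction

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Inverse over Q by Gauss-Jordan elimination with exact fractions."""
    n = len(m)
    aug = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [a / aug[c][c] for a in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def canonical_operator(b):
    """Matrix K = B^{-1} * B^t of the canonical operator of a nondegenerate form."""
    return matmul(inverse(b), transpose(b))

B = [[1, 2], [0, 1]]       # assumed nondegenerate nonsymmetric Gramian
K = canonical_operator(B)  # has trace -2 and determinant 1
```

For this `B`, the matrix K has characteristic polynomial (t + 1)², so −1 is its only eigenvalue. This agrees with the characteristic polynomial det(t_1 B − t_0 B^t) = (t_0 + t_1)² of the form itself.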


Example 16.5 (Nondegenerate Form of Type W_n(λ)) For every n ∈ N and λ ∈
k* = k ∖ 0, write W_n(λ) for the even-dimensional coordinate space k^{2n} equipped
with a bilinear form β whose Gramian in the standard basis has block form

\[ B = \begin{pmatrix} 0 & E_n \\ J_n(\lambda) & 0 \end{pmatrix}, \tag{16.21} \]

where E_n and J_n(λ) are the n × n identity matrix and the Jordan block¹⁸ of size n.
The bilinear form β is nonsymmetric, nondegenerate, and has canonical operator

\[
K = B^{-1}B^t =
\begin{pmatrix} 0 & J_n^{-1}(\lambda) \\ E_n & 0 \end{pmatrix}
\begin{pmatrix} 0 & J_n^{t}(\lambda) \\ E_n & 0 \end{pmatrix}
=
\begin{pmatrix} J_n^{-1}(\lambda) & 0 \\ 0 & J_n^{t}(\lambda) \end{pmatrix}.
\]

Since the matrices J_n^{−1}(λ) and J_n^t(λ) are similar¹⁹ to J_n(λ^{−1}) and J_n(λ) respectively,
we conclude that

\[ E\ell(\varkappa) = \bigl\{ (t - \lambda)^n,\ (t - \lambda^{-1})^n \bigr\}, \quad\text{where } \lambda \ne 0 . \]

In other words, the Jordan normal form of ϰ consists of two n × n blocks with
nonzero inverse eigenvalues.

17 See Sect. 16.1.7 on p. 392.


Example 16.6 (Nondegenerate Form of Type U_n) For n ∈ N, write U_n for the
coordinate space k^n equipped with a bilinear form β whose Gramian in the standard
basis is


(16.22) 


(-1)""? (yr 
(-1)""! (-1)""? 


(alternating ±1 on the secondary diagonal and strictly below it; all other elements
vanish). The canonical operator K = B^{−1}B^t of this form is equal to


(-1)""? (0 ioe (-1)""! 
(-1)"? (-1)""? (-1)"? (-1)""? 


-11 i. a 


18 See formula (15.9) on p. 375.
19 See Exercise 15.21 on p. 383 and Exercise 15.6 on p. 365.
=(-1".


and can be written as (−1)^{n−1}E + 2N + N², where N is a nilpotent operator with
N^n = 0 and N^{n−1} ≠ 0. If char k ≠ 2, then M = 2N + N² is also nilpotent with
M^n = 0 and M^{n−1} ≠ 0. Therefore, the Jordan normal form of K has exactly one
block J_n((−1)^{n−1}) of size n with eigenvalue −1 for even n and +1 for odd n.


16.3 Adjoint Operators 


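Let V be a vector space with nondegenerate bilinear form β. Then associated with
every linear operator f : V → V are the right adjoint operator f^∨ : V → V and the left
adjoint operator ^∨f : V → V. The first of these is determined by the rule

\[ \forall\, u, w \in V, \quad \beta(fu, w) = \beta(u, f^{\vee}w) \quad\text{or}\quad f^*\beta = \beta f^{\vee}, \tag{16.23} \]

which means commutativity of the following diagram:

\[
\begin{array}{ccc}
V^* & \xleftarrow{\;f^*\;} & V^* \\
{\scriptstyle \beta}\big\uparrow & & \big\uparrow{\scriptstyle \beta} \\
V & \xleftarrow{\;f^{\vee}\;} & V
\end{array}
\tag{16.24}
\]

i.e., f^∨ = β^{−1}f*β. The second operator ^∨f : V → V is determined by the
symmetric prescription

\[ \forall\, u, w \in V, \quad \beta({}^{\vee}\!fu, w) = \beta(u, fw) \quad\text{or}\quad ({}^{\vee}\!f)^*\beta = \beta f, \tag{16.25} \]

meaning the commutativity of the diagram

\[
\begin{array}{ccc}
V^* & \xleftarrow{\;f^*\;} & V^* \\
{\scriptstyle \beta^*}\big\uparrow & & \big\uparrow{\scriptstyle \beta^*} \\
V & \xleftarrow{\;{}^{\vee}\!f\;} & V
\end{array}
\tag{16.26}
\]

i.e., ^∨f = (β*)^{−1}f*β*. In every basis of V, the matrices ^∨F, F^∨ of the adjoint
operators are expressed in terms of the matrix F of f and the Gramian B of β as

\[ {}^{\vee}\!F = (B^t)^{-1} F^t B^t \quad\text{and}\quad F^{\vee} = B^{-1} F^t B . \tag{16.27} \]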


Exercise 16.12 Show that ^∨(f^∨) = f = (^∨f)^∨.

Both adjunction maps f ↦ f^∨, f ↦ ^∨f are antihomomorphisms with respect to the
composition,

\[ (fg)^{\vee} = g^{\vee} f^{\vee} \quad\text{and}\quad {}^{\vee}(fg) = {}^{\vee}\!g\; {}^{\vee}\!f, \]

because β(fgu, w) = β(gu, f^∨w) = β(u, g^∨f^∨w) and β(u, fgw) = β(^∨fu, gw) =
β(^∨g ^∨f u, w).
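The matrix formulas (16.27) make these identities easy to test numerically. The sketch below uses a hypothetical Gramian `B` and hypothetical operator matrices `F`, `G` over Q. It illustrates the relations on one example and does not replace the proofs asked for in the exercises.

```python
from fractions import Fraction

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Inverse over Q by Gauss-Jordan elimination with exact fractions."""
    n = len(m)
    aug = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [a / aug[c][c] for a in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

B = [[1, 2], [0, 1]]   # assumed nondegenerate Gramian
F = [[1, 1], [0, 2]]   # assumed operator matrices
G = [[0, 1], [1, 1]]

def right_adjoint(f):
    """Matrix B^{-1} * F^t * B of the right adjoint operator."""
    return matmul(matmul(inverse(B), transpose(f)), B)

def left_adjoint(f):
    """Matrix (B^t)^{-1} * F^t * B^t of the left adjoint operator."""
    bt = transpose(B)
    return matmul(matmul(inverse(bt), transpose(f)), bt)
```

The relation β(fu, w) = β(u, f^∨w) for all u, w is equivalent to the matrix identity F^t B = B F^∨, which is what the first check below verifies.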
Exercise 16.13 Show that the operator g : V → V is isometric if and only if g is
invertible and ^∨g = g^∨ = g^{−1}.

Proposition 16.3 (Reflexivity) For every vector space V with nondegenerate bilinear
form β and linear operator f : V → V, the following properties are
equivalent:

(1) f^{∨∨} = f,
(2) ^{∨∨}f = f,
(3) ^∨f = f^∨,
(4) ϰ_β f = f ϰ_β,

where ϰ_β = β^{−1}β* : V → V is the canonical operator²⁰ of β.

Proof By Exercise 16.12, the right adjunction of both sides in (3) leads to (1), from
which (3) can be reobtained by left adjunction of both sides. For the same reason,
(3) and (2) are equivalent as well. Property (3) can be written as (β*)^{−1}f*β* =
β^{−1}f*β, which is equivalent to the equality β(β*)^{−1}f* = f*β(β*)^{−1} between
the operators on V*. For the dual operators on V, we get the equality fβ^{−1}β* =
β^{−1}β*f, which is (4). □


16.3.1 Reflexive Operators 


A linear operator f : V → V is called reflexive with respect to a bilinear form β
if it satisfies the equivalent conditions from Proposition 16.3. If a form β is either
symmetric or skew-symmetric, then all linear endomorphisms of V are reflexive.
For a nonsymmetric form β, the reflexive operators form a proper k-subalgebra in
End(V), the centralizer of the canonical operator ϰ_β,

\[ Z(\varkappa_\beta) = \{\, f \in \operatorname{End}(V) \mid f\varkappa_\beta = \varkappa_\beta f \,\}. \]

20 See Sect. 16.2.5 on p. 397.

Restricted to Z(ϰ), the conjugation map f ↦ f^∨ = ^∨f becomes a linear involution.
Therefore, Z(ϰ) splits into a direct sum of ±1 eigenspaces

\[ Z(\varkappa_\beta) = Z_+(\varkappa) \oplus Z_-(\varkappa), \quad\text{where } Z_\pm(\varkappa) \overset{\text{def}}{=} \{\, f : V \to V \mid f^{\vee} = \pm f \,\}. \]

An operator f : V → V is called self-adjoint (respectively anti-self-adjoint) if f^∨ =
f (respectively f^∨ = −f). Equivalently, a self-adjoint operator f (respectively
anti-self-adjoint operator f) is defined by the prescription

\[ \forall\, u, w \in V \quad \beta(fu, w) = \beta(u, fw) \quad (\text{respectively } \beta(fu, w) = -\beta(u, fw)). \]

All (anti) self-adjoint operators are clearly reflexive. Therefore, the Z_±(ϰ) ⊂ Z(ϰ)
are exactly the subspaces of all (anti) self-adjoint operators. Every reflexive operator
f ∈ Z(ϰ) = Z_+(ϰ) ⊕ Z_−(ϰ) uniquely splits as f = f_+ + f_− for

\[ f_+ \overset{\text{def}}{=} (f + f^{\vee})/2 \in Z_+(\varkappa) \quad\text{and}\quad f_- \overset{\text{def}}{=} (f - f^{\vee})/2 \in Z_-(\varkappa). \]

Lemma 16.1 Nondegenerate bilinear forms α, β on V have equal canonical
operators ϰ_α = ϰ_β if and only if α = βf for some linear operator f : V → V
that is self-adjoint with respect to both forms.

Proof Let the canonical operators be equal, i.e., β^{−1}β* = α^{−1}α*. Dual to this
equality is β(β*)^{−1} = α(α*)^{−1}. The operator f = β^{−1}α = (β*)^{−1}α* commutes
with ϰ_α = ϰ_β, because fϰ_α = β^{−1}α* = ϰ_β f, and satisfies α = βf. Conversely,
if α = βf and f is self-adjoint with respect to β, i.e., f = f^∨ = β^{−1}f*β, then
ϰ_α = α^{−1}α* = f^{−1}β^{−1}f*β* = f^{−1}f^∨β^{−1}β* = ϰ_β. □
Theorem 16.1 Let the ground field k be algebraically closed of zero characteristic. 
Then two nondegenerate bilinear forms are equivalent if and only if their canonical 
operators are similar. 


Proof Given a linear automorphism $g : V \to V$ such that $\alpha = g^*\beta g$, we have

$$\varkappa_\alpha = \alpha^{-1}\alpha^* = g^{-1}\beta^{-1}\beta^* g = g^{-1}\varkappa_\beta\, g.$$

Conversely, let $\alpha$ and $\beta$ have similar canonical operators $\varkappa_\alpha = g^{-1}\varkappa_\beta\, g$. Let us replace $\beta$ by the equivalent correlation $g^*\beta g$. This allows us to assume that $\varkappa_\beta = \varkappa_\alpha$. Then by Lemma 16.1, $\alpha(u, w) = \beta(u, fw)$ for some nondegenerate operator $f : V \to V$ that is self-adjoint with respect to $\beta$. In Lemma 16.2 below, we construct²¹ a polynomial $p(t) \in k[t]$ such that the operator $h = p(f)$ satisfies $h^2 = f$. The operator $h$ is also self-adjoint with respect to $\beta$, because $h^\vee = p(f)^\vee = p(f^\vee) = p(f) = h$. Since $\alpha(u, w) = \beta(u, fw) = \beta(u, h^2 w) = \beta(hu, hw)$, the forms $\alpha$ and $\beta$ are equivalent. □

21 Under the assumption that the ground field is algebraically closed and has zero characteristic.
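
The first half of this proof can be checked numerically. The sketch below is only an illustration: it assumes that forms are given by Gramians over $\mathbb{Q}$, so that the canonical operator has matrix $B^{-1}B^t$, and the helper names (`mat_mul`, `inverse`, `canonical_operator`) as well as the sample matrices are hypothetical, not taken from the text.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(X):
    """Gauss-Jordan inverse over Q; assumes X is square and invertible."""
    n = len(X)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot row
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [a - factor * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def canonical_operator(B):
    """Matrix B^{-1} B^t of the canonical operator of the form with Gramian B."""
    return mat_mul(inverse(B), transpose(B))

B = [[1, 1], [0, 1]]                      # Gramian of a nonsymmetric form beta
G = [[2, 1], [1, 1]]                      # matrix of an automorphism g
A = mat_mul(mat_mul(transpose(G), B), G)  # Gramian of alpha = g* beta g

lhs = canonical_operator(A)
rhs = mat_mul(mat_mul(inverse(G), canonical_operator(B)), G)
```

Here `lhs` and `rhs` are the two sides of $\varkappa_\alpha = g^{-1}\varkappa_\beta\, g$ written in coordinates.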


Lemma 16.2 Over an algebraically closed field $k$ of characteristic zero, for every finite-dimensional nondegenerate operator $f$ there exists a polynomial $p(t) \in k[t]$ such that $p(f)^2 = f$.


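Proof Realize $f$ as multiplication by $t$ in the direct sum of residue modules of type $k[t]/((t - \lambda)^m)$, where $\lambda \neq 0$ by the assumption of the lemma. For each $\lambda$, write $m_\lambda$ for the maximal exponent of binomials $(t - \lambda)^m$ appearing in the sum. Put $s = t - \lambda$ and consider the first $m_\lambda$ terms of the formal power series expansion²²

$$\sqrt{t} = \sqrt{\lambda + s} = \lambda^{1/2}\left(1 + \lambda^{-1}s\right)^{1/2} = \lambda^{1/2} + \frac{1}{2}\lambda^{-1/2}s - \frac{1}{8}\lambda^{-3/2}s^2 + \frac{1}{16}\lambda^{-5/2}s^3 - \cdots,$$

where $\lambda^{1/2} \in k$ is either of the two roots of the quadratic equation $x^2 = \lambda$. Write $p_\lambda(t)$ for this sum considered as a polynomial in $t$. Then $p_\lambda^2(t) \equiv t \pmod{(t - \lambda)^{m_\lambda}}$. By the Chinese remainder theorem, there exists a polynomial $p(t) \in k[t]$ such that $p(t) \equiv p_\lambda(t) \pmod{(t - \lambda)^{m_\lambda}}$ for all $\lambda$. Then $p^2 \equiv t \pmod{(t - \lambda)^{m_\lambda}}$ for each $\lambda \in \operatorname{Spec}(f)$, that is, $p^2(f) = f$. □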
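
As a small illustration of the truncated series, the following sketch builds $p(J)$ for a single Jordan block $J = J_3(4)$ and lets one check that $p(J)^2 = J$. The choice $\lambda = 4$ with root $2$ is an assumption made only so that the arithmetic stays in $\mathbb{Q}$; the helper names are hypothetical.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lin_comb(coeffs, mats):
    """Entrywise sum of c * M over paired coefficients and square matrices."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

lam, root, m = Fraction(4), Fraction(2), 3   # assumption: root**2 == lam
identity = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
# N represents s = t - lam acting on k[t]/((t - lam)^3); it satisfies N^3 = 0.
N = [[Fraction(int(j == i + 1)) for j in range(m)] for i in range(m)]
J = lin_comb([lam, 1], [identity, N])        # Jordan block J_3(4)

# First m = 3 coefficients of sqrt(lam + s) in powers of s:
# lam^(1/2), (1/2) lam^(-1/2), -(1/8) lam^(-3/2).
coeffs = [root, 1 / (2 * root), -1 / (8 * root**3)]
P = lin_comb(coeffs, [identity, N, mat_mul(N, N)])   # P = p(J)
P_squared = mat_mul(P, P)
```

Because $N^3 = 0$, all terms of the series beyond the first three vanish on this block, which is why truncation after $m_\lambda$ terms suffices.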


Remark 16.2 It follows from Theorem 16.1 that over an algebraically closed field of zero characteristic, for each $n \in \mathbb{N}$ there exist a unique, up to equivalence, nondegenerate symmetric bilinear form of dimension $n$ and a unique, up to equivalence, nondegenerate skew-symmetric bilinear form of even dimension $2n$. Simple direct proofs of even stronger versions²³ of these claims will be given in Corollary 16.4 on p. 410 and Theorem 16.5 on p. 410 below.


Remark 16.3 If $k$ is not algebraically closed, Theorem 16.1 is not true even for symmetric forms. For example, over $\mathbb{Q}$ there is a vast set of inequivalent nondegenerate symmetric forms, and its description seems scarcely realizable at this time. The isometry classes of the symmetric bilinear forms over the fields $\mathbb{R}$ and $\mathbb{F}_p$ will be enumerated in Sect. 17.3 on p. 431.


22 See formula (4.29) on p. 86.
23 Assuming only the constraint $\operatorname{char} k \neq 2$ on the characteristic and algebraic closure in the first case and for an arbitrary ground field in the second.


16.4 Orthogonals and Orthogonal Projections


16.4.1 Orthogonal Projections 


Let $U$ be a subspace of a vector space $V$ with bilinear form $\beta : V \times V \to k$. We write

$${}^\perp U = \{v \in V \mid \forall\, u \in U \ \ \beta(v, u) = 0\}, \tag{16.28}$$
$$U^\perp = \{v \in V \mid \forall\, u \in U \ \ \beta(u, v) = 0\}, \tag{16.29}$$

for the left and right orthogonals to $U$. For symmetric and skew-symmetric forms, these two orthogonals coincide, whereas for a nonsymmetric form, they are usually distinct.
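
A minimal numerical sketch of this distinction, assuming a form on $\mathbb{Q}^2$ given by a hypothetical nonsymmetric Gramian $B$ and the line $U$ spanned by the first basis vector:

```python
def beta(B, u, v):
    """Value beta(u, v) = u^t B v of the bilinear form with Gramian B."""
    return sum(u[i] * B[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

B = [[1, 1],
     [0, 1]]             # nondegenerate and nonsymmetric (illustrative choice)
u = [1, 0]               # U = k * u

left_vec = [0, 1]        # beta(left_vec, u) == 0, so left_vec spans the left orthogonal
right_vec = [1, -1]      # beta(u, right_vec) == 0, so right_vec spans the right orthogonal

in_left = beta(B, left_vec, u) == 0
in_right = beta(B, u, right_vec) == 0
left_vec_in_right = beta(B, u, left_vec) == 0    # False: the two lines differ
right_vec_in_left = beta(B, right_vec, u) == 0   # False as well
```

Both orthogonals are lines here, in agreement with the dimension count for nondegenerate forms, but they are different lines.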


Proposition 16.4 If $\dim V < \infty$ and $\beta$ is nondegenerate, then for every subspace $U \subset V$, we have

$$\dim {}^\perp U = \dim V - \dim U = \dim U^\perp \quad\text{and}\quad ({}^\perp U)^\perp = U = {}^\perp(U^\perp).$$


Proof The first pair of equalities hold because the left and right orthogonals are the preimages of the same space $\operatorname{Ann} U \subset V^*$ of dimension²⁴ $\dim \operatorname{Ann} U = \dim V - \dim U$ under the linear isomorphisms $\beta^*, \beta : V \to V^*$. Since $U$ is a subspace of both spaces $({}^\perp U)^\perp$, ${}^\perp(U^\perp)$ and all three spaces have the same dimension, the second pair of equalities holds. □


Proposition 16.5 Let $V$ be an arbitrary²⁵ vector space with bilinear form $\beta$, and let $U \subset V$ be a finite-dimensional subspace such that the restriction of $\beta$ to $U$ is nondegenerate. Then $V = U^\perp \oplus U = U \oplus {}^\perp U$, and for every vector $v \in V$, its projections $v_U$ and ${}_U v$ on $U$ respectively along $U^\perp$ and along ${}^\perp U$ are uniquely determined by the conditions $\beta(u, v) = \beta(u, v_U)$ and $\beta(v, u) = \beta({}_U v, u)$ for all $u \in U$. For a basis $u_1, u_2, \ldots, u_m$ in $U$, these projections can be computed as

$$v_U = \sum_i \beta({}^\vee u_i, v) \cdot u_i \quad\text{and}\quad {}_U v = \sum_i \beta(v, u_i^\vee) \cdot u_i,$$

where ${}^\vee u_i$ and $u_i^\vee$ are taken from the left and right dual bases²⁶ in $U$.


Proof For every $v \in V$, the existence of the decomposition $v = w + v_U$ with $w \in U^\perp$, $v_U \in U$ means the existence of a vector $v_U \in U$ such that $\beta(u, v) = \beta(u, v_U)$ for all $u \in U$, because the latter equality says that $v - v_U \in U^\perp$. The uniqueness of the decomposition $v = w + v_U$ means that the vector $v_U$ is uniquely determined by $v$. If the restriction of $\beta$ to $U$ is nondegenerate, then for every $v \in V$, the linear form $\beta(v) : U \to k$, $u \mapsto \beta(u, v)$, is uniquely represented as an inner product from the right with some vector $v_U \in U$, which is uniquely determined by $v$. We conclude that for every $v$, there exists a unique $v_U \in U$ such that $\beta(u, v) = \beta(u, v_U)$ for all $u \in U$, as required. This proves that $V = U^\perp \oplus U$. Since $\beta({}^\vee u_i, v_U) = \beta({}^\vee u_i, v)$, the expansion of the vector $v_U$ through the basis $u_1, u_2, \ldots, u_m$ in $U$ by formula (16.18) on p. 395 is $v_U = \sum_i \beta({}^\vee u_i, v_U) \cdot u_i = \sum_i \beta({}^\vee u_i, v) \cdot u_i$. For the left orthogonal ${}^\perp U$, the arguments are completely symmetric. □

24 See Proposition 7.3 on p. 162.
25 Not necessarily finite-dimensional.
26 See formula (16.17) on p. 395.
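
For a one-dimensional subspace $U = k \cdot u$ with $\beta(u, u) \neq 0$, the defining conditions of the two projections reduce to one division each. The sketch below uses that special case only; the function names and the sample Gramian are assumptions made for illustration.

```python
from fractions import Fraction

def beta(B, u, v):
    """Value beta(u, v) = u^t B v of the bilinear form with Gramian B."""
    return sum(u[i] * B[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def right_projection(B, u, v):
    """v_U for U = k*u: the vector c*u with beta(u, v) = beta(u, c*u)."""
    c = Fraction(beta(B, u, v), beta(B, u, u))
    return [c * x for x in u]

def left_projection(B, u, v):
    """_U v for U = k*u: the vector c*u with beta(v, u) = beta(c*u, u)."""
    c = Fraction(beta(B, v, u), beta(B, u, u))
    return [c * x for x in u]

B = [[1, 1],
     [0, 1]]                      # illustrative nonsymmetric Gramian
u, v = [1, 0], [0, 1]

v_U = right_projection(B, u, v)   # projection along the right orthogonal
U_v = left_projection(B, u, v)    # projection along the left orthogonal

rest_right = [a - b for a, b in zip(v, v_U)]   # v - v_U, lies in the right orthogonal
rest_left = [a - b for a, b in zip(v, U_v)]    # v - _U v, lies in the left orthogonal
```

For this nonsymmetric form the two projections of the same vector differ, while each remainder is orthogonal to $U$ on the appropriate side.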


Corollary 16.1 Let $V$ be a finite-dimensional vector space with nondegenerate bilinear form $\beta$ and let $U \subset V$ be a subspace such that the restriction of $\beta$ to $U$ is nondegenerate as well. Then the restrictions of $\beta$ to both orthogonals $U^\perp$, ${}^\perp U$ are nondegenerate too. For every vector $v \in V$, its projection ${}_{U^\perp} v \in U^\perp$ along $U$ in the decomposition $V = U^\perp \oplus U$ is equal to $v - v_U \in U^\perp$ and is uniquely determined by the equality $\beta(v, w'') = \beta({}_{U^\perp} v, w'')$ for all $w'' \in U^\perp$. Symmetrically, the projection $v_{{}^\perp U} \in {}^\perp U$ of $v$ onto ${}^\perp U$ along $U$ in the decomposition $V = U \oplus {}^\perp U$ is equal to $v - {}_U v$ and is uniquely determined by the equality $\beta(w', v) = \beta(w', v_{{}^\perp U})$ for all $w' \in {}^\perp U$.


Proof For every nonzero $w'' \in U^\perp$ there exists some $v \in V$ such that $\beta(v, w'') \neq 0$. We use the decomposition $V = U^\perp \oplus U$ from Proposition 16.5 to write it as $v = {}_{U^\perp} v + v_U$, where ${}_{U^\perp} v = v - v_U \in U^\perp$. The orthogonality relation $\beta(v_U, w'') = 0$ forces $\beta({}_{U^\perp} v, w'') = \beta(v, w'') \neq 0$. This proves that the restriction of $\beta$ to $U^\perp$ is nondegenerate and $\beta(v, w'') = \beta({}_{U^\perp} v, w'')$ for all $w'' \in U^\perp$. By Proposition 16.5 applied to $U^\perp$ in the role of $U$, the vector ${}_{U^\perp} v$ is the projection of $v$ onto $U^\perp$ along ${}^\perp(U^\perp) = U$ in the direct sum $V = {}^\perp(U^\perp) \oplus U^\perp = U \oplus U^\perp$. The case of the left orthogonal is completely symmetric. □


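Corollary 16.2 Under the assumptions of Corollary 16.1, the restrictions on $U^\perp$ and ${}^\perp U$ of the two projections along $U$ in the decompositions

$$U \oplus {}^\perp U = V = U^\perp \oplus U$$

establish the inverse isometric isomorphisms

$$\lambda : U^\perp \to {}^\perp U, \quad w \mapsto w_{{}^\perp U} = w - {}_U w, \tag{16.30}$$
$$\varrho : {}^\perp U \to U^\perp, \quad w \mapsto {}_{U^\perp} w = w - w_U. \tag{16.31}$$

For all $v \in V$, the projections $v_{{}^\perp U}$ and ${}_{U^\perp} v$ along $U$ are sent to each other by these isometries.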


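Proof The projection (16.30) is isometric, because for all $w', w'' \in U^\perp$,

$$\begin{aligned}
\beta(w'_{{}^\perp U}, w''_{{}^\perp U}) &= \beta(w' - {}_U w', \, w'' - {}_U w'') \\
&= \beta(w', w'') - \beta({}_U w', w'') - \beta(w' - {}_U w', \, {}_U w'') \\
&= \beta(w', w'') - \beta({}_U w', w'') - \beta(w'_{{}^\perp U}, \, {}_U w'') = \beta(w', w'').
\end{aligned}$$

The isometric property of the projection (16.31) is verified similarly. Since both projections $v \mapsto {}_U v$, $v \mapsto v_U$ act identically on $U$, for every $v \in V$ we have

$$({}_{U^\perp} v)_{{}^\perp U} = (v - v_U)_{{}^\perp U} = v - v_U - {}_U(v - v_U) = v - v_U - {}_U v + v_U = v - {}_U v = v_{{}^\perp U}.$$

A similar computation verifies the equality ${}_{U^\perp}(v_{{}^\perp U}) = {}_{U^\perp} v$. Thus, both $\lambda$ and $\varrho$ are isometric and swap the two projections $v_{{}^\perp U} \in {}^\perp U$ and ${}_{U^\perp} v \in U^\perp$ of every vector $v \in V$. This forces $\lambda$, $\varrho$ to be inverse to each other. □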


16.4.2 Biorthogonal Direct Sums 


Subspaces $U, W \subset V$ are called biorthogonal with respect to a bilinear form $\beta$ if $\beta(u, w) = 0$ and $\beta(w, u) = 0$ for all $u \in U$, $w \in W$. A form $\beta$ is called decomposable if $V$ splits into a direct sum of two nonzero biorthogonal subspaces. Otherwise, $\beta$ is called indecomposable. If $\beta$ is nondegenerate and $V = U \oplus W$, where $U, W \subset V$ are biorthogonal, then both restrictions of $\beta$ onto $U$, $W$ are forced to be nondegenerate, and the Gramian of $\beta$ in any basis compatible with the decomposition $V = U \oplus W$ takes the block diagonal form

$$B = \begin{pmatrix} B_U & 0 \\ 0 & B_W \end{pmatrix},$$

where $B_U$, $B_W$ are the Gramians of the restricted forms $\beta_U$, $\beta_W$. Therefore, the canonical operator of $\beta$ also decomposes into a direct sum of canonical operators of the restricted forms on the summands. Every finite-dimensional space with bilinear form is certainly a direct orthogonal sum of indecomposable spaces.


Exercise 16.14 Check that the $2k$-dimensional space $W_k(\lambda)$ from Example 16.5 on p. 397 for $\lambda = (-1)^{k-1}$ decomposes into a biorthogonal direct sum of two subspaces $U_k$.


16.4.3 Classification of Nondegenerate Forms 


Everywhere in this section we assume by default that the ground field $k$ is algebraically closed of zero characteristic. For a linear operator $f : V \to V$, we say that a sequence of vectors $0 = u_0, u_1, u_2, \ldots, u_k \in V$ is a Jordan chain²⁷ of length $k$ with eigenvalue $\lambda$ for $f$ if $f(u_\nu) = \lambda u_\nu + u_{\nu-1}$ for all $1 \le \nu \le k$. This means that the linear span $U$ of the vectors $u_1, u_2, \ldots, u_k$ is $f$-invariant and the matrix of $f|_U$ in the basis $u_1, u_2, \ldots, u_k$ is the Jordan block²⁸ $J_k(\lambda)$.

27 See Sect. 15.2.1 on p. 367.


Lemma 16.3 Let $f : V \to V$ be an isometry of an arbitrary nondegenerate bilinear form $\beta$ on $V$ and let

$$0 = u_0, u_1, u_2, \ldots, u_\ell, \qquad 0 = w_0, w_1, w_2, \ldots, w_m,$$

be two Jordan chains for $f$ with eigenvalues $\lambda$ and $\mu$ respectively. If $\lambda\mu \neq 1$ or $\ell \neq m$, then these chains are totally biorthogonal. If $\lambda\mu = 1$ and $\ell = m$, then the biorthogonality relations $\beta(u_i, w_j) = \beta(w_i, u_j) = 0$ hold for all $i + j \le m$.


Proof Since $\beta(u_i, w_j) = \beta(fu_i, fw_j) = \beta(\lambda u_i + u_{i-1}, \mu w_j + w_{j-1})$, for all $1 \le i \le \ell$ and $1 \le j \le m$ we have the following recurrence relation:

$$(1 - \lambda\mu)\,\beta(u_i, w_j) = \lambda\beta(u_i, w_{j-1}) + \mu\beta(u_{i-1}, w_j) + \beta(u_{i-1}, w_{j-1}). \tag{16.32}$$

For $\lambda\mu \neq 1$, this implies by increasing induction²⁹ on $i + j$ that $\beta(u_i, w_j) = 0$ for all $i$, $j$. For $\lambda\mu = 1$, the equality (16.32) becomes the relation

$$\lambda\beta(u_i, w_{j-1}) + \mu\beta(u_{i-1}, w_j) + \beta(u_{i-1}, w_{j-1}) = 0.$$

The same increasing induction on $i + j$ shows that for every fixed $i + j = k \le \max(\ell, m)$, the equality $\lambda\beta(u_i, w_{j-1}) = -\mu\beta(u_{i-1}, w_j)$ holds for all $i$, $j$ with $i + j = k$. If $\ell = m$, then increasing induction on $i$ shows that $\beta(u_i, w_j) = 0$ for all $i + j = k \le \ell$. For $\ell < m$, increasing induction on $j$ shows that $\beta(u_i, w_j) = 0$ for all $i + j = k \le m$. □


Theorem 16.2 Let $f : V \to V$ be an isometry of a nondegenerate indecomposable bilinear form $\beta$ on $V$. Then $\mathcal{E}\ell(f)$ consists either of $k$ binomials $(t - 1)^m$ with common $m = \dim V / k$ or of $k$ binomials $(t + 1)^m$ with common $m = \dim V / k$ or of $k$ pairs of binomials $(t - \lambda)^m$, $(t - \lambda^{-1})^m$ with common $m = \dim V / (2k)$ and common $\lambda \neq \pm 1$.


Proof Fix some Jordan basis $e$ for $f$, and for each $\lambda \in \operatorname{Spec} f$, write $U_\lambda$ for the linear span of all Jordan chains of maximal length with eigenvalue $\lambda$ in $e$. By Lemma 16.3, $V$ splits into a direct biorthogonal sum $V = W \oplus W'$, where $W = U_\lambda + U_{\lambda^{-1}}$ and $W'$ is the linear span of all the other Jordan chains in $e$, which either have different eigenvalues or are shorter. Since $\beta$ is indecomposable, we conclude that $W' = 0$, $V = U_\lambda + U_{\lambda^{-1}}$, and $\operatorname{Spec}(f) = \{\lambda, \lambda^{-1}\}$. If $\lambda = \pm 1$, then $V = U_\lambda = U_{\lambda^{-1}}$, and the theorem holds. If $\lambda \neq \pm 1$, then $V = U_\lambda \oplus U_{\lambda^{-1}}$, and Lemma 16.3 forces both summands to be isotropic and to have the same length of Jordan chains spanning them. By Proposition 16.2 on p. 396, each summand has dimension at most $\frac{1}{2}\dim V$. This forces $\dim U_\lambda = \dim U_{\lambda^{-1}} = \frac{1}{2}\dim V$. Therefore, all Jordan chains in $e$ can be combined in pairs with equal lengths and inverse eigenvalues. □

28 See Corollary 15.11 on p. 376.
29 The induction begins with $i + j = 0$ and $u_0 = w_0 = 0$.

Theorem 16.3 Over an algebraically closed field of zero characteristic, a finite collection of possibly repeated binomials $(t - \lambda)^m$ with some $\lambda \in k$, $m \in \mathbb{N}$ is realized as a collection of elementary divisors of the canonical operator of a nondegenerate bilinear form if and only if this collection satisfies the following four conditions: (1) all $\lambda$ are nonzero; (2) all divisors with $\lambda \neq \pm 1$ are split into disjoint pairs with equal exponents $m$ and inverse roots $\lambda$, $\lambda^{-1}$; (3) for each even $m$, the number of divisors $(t - 1)^m$ is even; (4) for each odd $m$, the number of divisors $(t + 1)^m$ is even.
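
Conditions (1)-(4) are purely combinatorial, so they can be tested mechanically. The following sketch is an illustration under a simplifying assumption: the roots $\lambda$ are taken to be rational numbers so that exact comparison is possible, and the function name `is_realizable` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

def is_realizable(divisors):
    """Check conditions (1)-(4) for a list of pairs (lam, m) encoding (t - lam)^m."""
    counts = Counter((Fraction(lam), m) for lam, m in divisors)
    for (lam, m), c in counts.items():
        if lam == 0:                        # condition (1): all roots nonzero
            return False
        if lam == 1:
            if m % 2 == 0 and c % 2 == 1:   # condition (3): even m, root +1
                return False
        elif lam == -1:
            if m % 2 == 1 and c % 2 == 1:   # condition (4): odd m, root -1
                return False
        elif counts[(1 / lam, m)] != c:     # condition (2): pairing with the inverse root
            return False
    return True

paired = is_realizable([(2, 3), (Fraction(1, 2), 3)])   # (t-2)^3, (t-1/2)^3
unpaired = is_realizable([(2, 3)])
single_even_plus = is_realizable([(1, 2)])               # one (t-1)^2
single_odd_plus = is_realizable([(1, 3)])                # one (t-1)^3
single_odd_minus = is_realizable([(-1, 1)])              # one (t+1)^1
```

For $\lambda \neq \pm 1$ the roots $\lambda$ and $\lambda^{-1}$ are distinct, so a splitting into disjoint pairs exists exactly when the two multiplicities agree, which is what the last branch compares.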


Proof Let a collection of elementary divisors $(t - \lambda)^m$ satisfy conditions (1)-(4). Then it consists of some disjoint pairs of the form $(t - \lambda)^m$, $(t - \lambda^{-1})^m$, where repeated pairs are allowed, and some unpaired divisors $(t + 1)^{2k}$, $(t - 1)^{2k+1}$ whose exponents do not repeat. Each pair $(t - \lambda)^m$, $(t - \lambda^{-1})^m$ can be realized as a collection of elementary divisors of the canonical operator on the space³⁰ $W_m(\lambda)$. Each divisor $(t + 1)^{2k}$ (respectively $(t - 1)^{2k+1}$) can be realized as an elementary divisor of the canonical operator on the space³¹ $U_{2k}$ (respectively $U_{2k+1}$). The total collection is realized in the biorthogonal direct sum of these spaces. Thus, conditions (1)-(4) are sufficient.

Let us verify that they are necessary. It is enough to check that conditions (1)-(4) hold for every indecomposable nondegenerate form. Since the canonical operator is isometric, we can apply Theorem 16.2. It says that $\operatorname{Spec} \varkappa$ is either $\{+1\}$ or $\{-1\}$ or $\{\lambda, \lambda^{-1}\}$ for some $\lambda \neq \pm 1$, and in the latter case, the elementary divisors of $\varkappa$ split into pairs with equal exponents and inverse roots. Therefore, conditions (1), (2) are necessary. Now assume $\operatorname{Spec} \varkappa = \{\varepsilon\}$, where $\varepsilon = \pm 1$. By Theorem 16.2, the Jordan basis of $\varkappa$ consists of $k$ Jordan chains of the same length $m$. Therefore, $\varkappa = \varepsilon\,\mathrm{Id} + \eta$, where $\eta$ is a nilpotent operator with $\eta^m = 0$ and

$$\dim \operatorname{im} \eta^{m-1} = \dim(V / \ker \eta^{m-1}) = k.$$

Let $W = V / \ker \eta^{m-1}$. The orthogonality relations from Lemma 16.3 on p. 406 imply that $\operatorname{im} \eta^{m-1}$ is biorthogonal to $\ker \eta^{m-1}$.


Exercise 16.15 Check this. 


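Hence, there is a bilinear form $\alpha$ on $W$ well defined by $\alpha([u], [w]) := \beta(u, \eta^{m-1} w)$, where $u, w \in V$ and $[u], [w] \in W$ mean their classes modulo $\ker \eta^{m-1}$. This form is nondegenerate, because for every class $[w] \neq 0$, the vector $\eta^{m-1} w$ is nonzero, and therefore $\beta(u, \eta^{m-1} w) \neq 0$ for some $u \in V$. On the other hand, $\operatorname{im} \eta^{m-1} = \ker \eta = \ker(\varkappa - \varepsilon\,\mathrm{Id})$ consists of eigenvectors of $\varkappa$, which all have eigenvalue $\varepsilon$. Therefore, the form $\alpha$ can be written as

$$\alpha([u], [w]) = \beta(u, \eta^{m-1} w) = \beta(\varkappa^{-1}\eta^{m-1} w, u) = \varepsilon\beta(\eta^{m-1} w, u).$$

The operator adjoint to $\eta$ is $\eta^\vee = (\varkappa - \varepsilon\,\mathrm{Id})^\vee = \varkappa^{-1} - \varepsilon\,\mathrm{Id} = (\varepsilon\,\mathrm{Id} + \eta)^{-1} - \varepsilon\,\mathrm{Id} = \varepsilon\left((\mathrm{Id} + \varepsilon\eta)^{-1} - \mathrm{Id}\right) = -\eta + \varepsilon\eta^2 - \eta^3 + \cdots$, which has $(\eta^\vee)^{m-1} = (-1)^{m-1}\eta^{m-1}$. Hence

$$\begin{aligned}
\alpha([u], [w]) &= \varepsilon\beta(\eta^{m-1} w, u) = \varepsilon\beta\left(w, (\eta^\vee)^{m-1} u\right) \\
&= (-1)^{m-1}\varepsilon\beta(w, \eta^{m-1} u) = (-1)^{m-1}\varepsilon\,\alpha([w], [u]).
\end{aligned}$$

Thus for $\varepsilon = (-1)^m$, the nondegenerate form $\alpha$ on $W$ is skew-symmetric. This forces $\dim W$ to be even by Exercise 16.9 on p. 394. Hence the numbers of Jordan chains with even length $m$ and eigenvalue $+1$ and with odd length $m$ and eigenvalue $-1$ both should be even; that is, conditions (3), (4) are necessary too. □

30 See (16.21) on p. 398.
31 See (16.22) on p. 398.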


Corollary 16.3 Over an algebraically closed field of zero characteristic, all indecomposable finite-dimensional vector spaces with nondegenerate bilinear forms are exhausted up to isometry by the $n$-dimensional spaces $U_n$, $n \in \mathbb{N}$, from Example 16.6 on p. 398, and the $2n$-dimensional spaces $W_n(\lambda)$, $n \in \mathbb{N}$, $\lambda \neq (-1)^{n-1}$, from Example 16.5 on p. 397. All these forms are inequivalent. □


16.5 Symmetric and Skew-Symmetric Forms 


In this section, $k$ means any field of any characteristic. Recall³² that a bilinear form $\beta$ on $V$ is called symmetric if $\beta(v, w) = \beta(w, v)$ for all $v, w \in V$. This is equivalent to the conditions $\varkappa_\beta = \mathrm{Id}_V$, $\beta^* = \beta$, and $B^t = B$, where $B$ is any Gramian of $\beta$.

A bilinear form is called skew-symmetric if $\beta(v, v) = 0$ for all $v \in V$. This definition implies what was given in Sect. 16.1.6 on p. 391, because the equalities

$$0 = \beta(u + w, u + w) = \beta(u, u) + \beta(w, w) + \beta(u, w) + \beta(w, u) = \beta(u, w) + \beta(w, u)$$

force $\beta(u, w) = -\beta(w, u)$. If $\operatorname{char} k \neq 2$, then both definitions are equivalent, and they mean that $\varkappa_\beta = -\mathrm{Id}_V$, $\beta^* = -\beta$, and $B^t = -B$, where $B$ is any Gramian of $\beta$. However, for $\operatorname{char} k = 2$, the condition $\beta(u, w) = -\beta(w, u)$ becomes $\beta(u, w) = \beta(w, u)$, whereas the condition $\beta(v, v) = 0$ becomes more restrictive: it says that the Gramian $B$ of a skew-symmetric form is symmetric with zeros on the main diagonal. In other words, for $\operatorname{char} k = 2$, the involution $\beta \mapsto \beta^*$ is not diagonalizable but is similar to multiplication by $t$ in $k[t]/((t + 1)^2)$. The symmetric forms form the kernel of the nilpotent operator $\beta \mapsto \beta + \beta^*$, whereas the skew-symmetric forms form the image of this operator.


32 See Sect. 16.1.6 on p. 391.

16.5.1 Orthogonals and Kernel


Let $\beta$ be either symmetric or skew-symmetric. Then the left and right orthogonals to every subspace $U \subset V$ coincide. This two-sided orthogonal is denoted by

$$U^\perp = \{w \in V \mid \forall\, u \in U \ \ \beta(u, w) = 0\}$$

and called just the orthogonal to $U$. In particular, the left and right kernels of $\beta$ coincide:

$$V^\perp = {}^\perp V = \{w \in V \mid \forall\, v \in V \ \ \beta(v, w) = 0\}.$$

We call this space just the kernel of $\beta$ and denote it by $\ker \beta$.


Proposition 16.6 For every subspace $U \subset V$ complementary³³ to $\ker \beta$, the restriction to $U$ of the (skew) symmetric form $\beta$ is nondegenerate.


Proof Let $u \in U$ satisfy $\beta(u, w) = 0$ for all $w \in U$. Since $V = U \oplus \ker \beta$, we can write every $v \in V$ as $v = w + e$, where $w \in U$ and $e \in \ker \beta$. Then $\beta(u, v) = \beta(u, w) + \beta(u, e) = 0$ for every $v \in V$. Hence, $u \in U \cap \ker \beta = 0$. □


Caution 2 For a nonsymmetric bilinear form, Proposition 16.6 formulated for a 
one-sided kernel, whether a left kernel or a right kernel, is false. 


16.5.2. Orthogonal Projections 


If the restriction of a (skew) symmetric form $\beta$ to a subspace $U \subset V$ is nondegenerate, then $V = U \oplus U^\perp$ by Proposition 16.5. In this case, the subspace $U^\perp$ is called the orthogonal complement to $U$. The projection of the vector $v \in V$ onto $U$ along $U^\perp$ is denoted by $\pi_U v$ and called the orthogonal projection. It is uniquely determined by the relation $\beta(u, v) = \beta(u, \pi_U v)$ for all $u \in U$. If a (skew) symmetric form $\beta$ is nondegenerate on all of $V$, then by Proposition 16.4, $\dim U^\perp = \dim V - \dim U$ and $U^{\perp\perp} = U$ for all subspaces $U \subset V$. By Corollary 16.1, the restriction of a nondegenerate form $\beta$ onto a subspace $U \subset V$ is nondegenerate if and only if the restriction of $\beta$ onto $U^\perp$ is nondegenerate.


Theorem 16.4 (Lagrange's Theorem) Let $\beta$ be a symmetric bilinear form on a finite-dimensional vector space over an arbitrary field $k$ with $\operatorname{char} k \neq 2$. Then $\beta$ has a diagonal Gramian in some basis.


Proof Induction on $\dim V$. If $\dim V = 1$ or if $\beta$ is zero, then the Gramian of $\beta$ is diagonal in every basis. If $\beta \neq 0$, then there exists some vector $e \in V$ such that $\beta(e, e) \neq 0$, because otherwise,

$$\beta(u, w) = \left(\beta(u + w, u + w) - \beta(u, u) - \beta(w, w)\right)/2 = 0$$

for all $u, w \in V$. Since the restriction of $\beta$ to the 1-dimensional subspace $U = k \cdot e$ is nondegenerate, $V = U \oplus U^\perp$. By the inductive hypothesis, there is a basis with diagonal Gramian in $U^\perp$. Attaching the vector $e$ to this basis, we get the required basis in $V$. □

33 That is, such that $V = U \oplus \ker \beta$.
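
The induction in this proof can be unrolled into an iterative procedure. The sketch below is one possible realization over $\mathbb{Q}$, not the author's algorithm: it looks for a vector with nonzero square among the remaining basis vectors, falls back on a sum $e_k + e_j$ when all squares vanish (which is where $\operatorname{char} k \neq 2$ is used), and then makes the remaining vectors orthogonal to the chosen one. The names `beta`, `diagonalizing_basis`, and `gram` are hypothetical.

```python
from fractions import Fraction

def beta(B, u, v):
    """Value beta(u, v) = u^t B v of the bilinear form with Gramian B."""
    return sum(u[i] * B[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def diagonalizing_basis(B):
    """Return basis vectors (as coordinate rows) in which symmetric B is diagonal."""
    n = len(B)
    vs = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if beta(B, vs[k], vs[k]) == 0:
            # Prefer a later vector with nonzero square; otherwise use a sum.
            j = next((j for j in range(k + 1, n) if beta(B, vs[j], vs[j]) != 0), None)
            if j is not None:
                vs[k], vs[j] = vs[j], vs[k]
            else:
                j = next((j for j in range(k + 1, n) if beta(B, vs[k], vs[j]) != 0), None)
                if j is None:
                    continue  # vs[k] is already orthogonal to everything that remains
                vs[k] = [a + b for a, b in zip(vs[k], vs[j])]
        e = vs[k]
        q = beta(B, e, e)
        for j in range(k + 1, n):
            c = beta(B, e, vs[j]) / q
            vs[j] = [a - c * b for a, b in zip(vs[j], e)]  # project along e
    return vs

def gram(B, vs):
    return [[beta(B, u, v) for v in vs] for u in vs]

B = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]
basis = diagonalizing_basis(B)
G = gram(B, basis)
```

On this sample matrix every diagonal entry is zero, so the first step has to use the sum $e_1 + e_2$, exactly the situation handled by the polarization identity in the proof.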


Corollary 16.4 Two symmetric bilinear forms over an algebraically closed field $k$ with $\operatorname{char}(k) \neq 2$ are equivalent if and only if they have equal ranks.³⁴


Proof Over an algebraically closed field, every nonzero diagonal element of the Gramian is equal to 1 after dividing the corresponding basis vector $e_i$ by $\sqrt{\beta(e_i, e_i)}$. □


Theorem 16.5 (Darboux's Theorem) Over an arbitrary field,³⁵ every finite-dimensional vector space $V$ with nondegenerate skew-symmetric form $\omega$ is equivalent to a symplectic space.³⁶ In particular, $\dim V$ is necessarily even.


Proof We will construct a basis $e_1, e_2, \ldots, e_{2n}$ in $V$ with block diagonal Gramian formed by $2 \times 2$ blocks

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{16.33}$$

After that, we reorder the basis vectors by writing first all vectors $e_i$ with odd $i$ and then all vectors $e_i$ with even $i$. This gives the symplectic Gramian

$$J = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}.$$

Let $e_1 \in V$ be an arbitrary nonzero vector. Since $\omega$ is nondegenerate, there exists a vector $w \in V$ such that $\omega(e_1, w) = a \neq 0$. We put $e_2 = w/a$. Then the Gramian of $(e_1, e_2)$ has the required form (16.33). Write $U \subset V$ for the linear span of $e_1$, $e_2$. Since the Gramian of $(e_1, e_2)$ has nonzero determinant, the vectors $e_1$, $e_2$ are linearly independent. Thus, $\dim U = 2$, and the restriction of $\omega$ to $U$ is nondegenerate. Therefore, $V = U \oplus U^\perp$, and the restriction of $\omega$ to $U^\perp$ is nondegenerate as well. Induction on $\dim V$ allows us to assume that $U^\perp$ has a basis with the required block diagonal Gramian. We attach $e_1$, $e_2$ to this basis and get the required basis in $V$. □
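
The proof is constructive, and the induction can be run as a loop. The sketch below is an illustration over $\mathbb{Q}$ under the assumption that the input Gramian is skew-symmetric and nondegenerate; the names `omega` and `darboux_basis` are hypothetical. At each step it picks $e_1$, finds a partner $w$ with $\omega(e_1, w) \neq 0$, rescales it to $e_2$, and replaces every remaining vector $w$ by $w + \omega(e_2, w)\,e_1 - \omega(e_1, w)\,e_2$, which is orthogonal to both $e_1$ and $e_2$.

```python
from fractions import Fraction

def omega(A, u, v):
    """Value omega(u, v) = u^t A v of the form with Gramian A."""
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def darboux_basis(A):
    """Basis with block diagonal Gramian made of blocks (0 1; -1 0).

    Assumes A is skew-symmetric and nondegenerate (hence of even size).
    """
    n = len(A)
    vs = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(0, n, 2):
        e1 = vs[k]
        # Nondegeneracy guarantees a partner among the remaining vectors.
        j = next(j for j in range(k + 1, n) if omega(A, e1, vs[j]) != 0)
        vs[k + 1], vs[j] = vs[j], vs[k + 1]
        a = omega(A, e1, vs[k + 1])
        e2 = vs[k + 1] = [x / a for x in vs[k + 1]]   # now omega(e1, e2) = 1
        for j in range(k + 2, n):
            w = vs[j]
            c1, c2 = omega(A, e2, w), omega(A, e1, w)
            vs[j] = [x + c1 * y - c2 * z for x, y, z in zip(w, e1, e2)]
    return vs

A = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]
basis = darboux_basis(A)
G = [[omega(A, u, v) for v in basis] for u in basis]
```

Reordering the resulting vectors as in the proof (odd positions first, then even ones) turns the block diagonal Gramian `G` into the symplectic Gramian $J$.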


34 See Definition 16.1 on p. 391.
35 Of any characteristic.
36 See Example 16.3 on p. 393.

16.5.3 Adjoint Operators


Since the canonical operator of a nondegenerate (skew) symmetric form $\beta$ belongs to the center of the algebra $\operatorname{End} V$, every linear operator $f : V \to V$ is reflexive³⁷ for $\beta$. We write $f^\vee = \beta^{-1} f^* \beta$ for the (two-sided) operator adjoint to $f$. For all $u, w \in V$, it satisfies the equalities $\beta(u, f^\vee w) = \beta(fu, w)$ and $\beta(f^\vee u, w) = \beta(u, fw)$, which are equivalent to each other if $\beta$ is (skew) symmetric. If $\operatorname{char} k \neq 2$, then the involution $f \mapsto f^\vee$ is diagonalizable, and $\operatorname{End}(V) = \operatorname{End}_+(V) \oplus \operatorname{End}_-(V)$, where the $\pm 1$-eigenspaces

$$\operatorname{End}_+(V) := \{f \in \operatorname{End}(V) \mid f^\vee = f\},$$
$$\operatorname{End}_-(V) := \{f \in \operatorname{End}(V) \mid f^\vee = -f\}$$

consist of self-adjoint and anti-self-adjoint operators, which satisfy for all $u, w \in V$ the equalities $\beta(fu, w) = \beta(u, fw)$ and $\beta(fu, w) = -\beta(u, fw)$ respectively.


16.5.4 Form—Operator Correspondence 


For a nondegenerate symmetric form $\beta$ on $V$, the linear isomorphism

$$\operatorname{End}(V) \to \operatorname{Hom}(V, V^*), \quad f \mapsto \beta f, \tag{16.34}$$

from Sect. 16.2.4 on p. 396 establishes linear isomorphisms between (anti) self-adjoint operators and (skew) symmetric correlations. For a nondegenerate skew-symmetric form $\beta$, the isomorphism (16.34) takes self-adjoint operators to skew-symmetric forms and anti-self-adjoint operators to symmetric forms.


16.6 Symplectic Spaces 


We write $\Omega_{2n}$ for the $2n$-dimensional symplectic space considered up to isometric isomorphism. A convenient coordinate-free realization of $\Omega_{2n}$ is provided by the direct sum

$$W = U \oplus U^*, \quad \dim U = n, \tag{16.35}$$

equipped with the skew-symmetric form

$$\omega\left((u_1, \xi_1), (u_2, \xi_2)\right) := \langle u_1, \xi_2 \rangle - \langle u_2, \xi_1 \rangle. \tag{16.36}$$

For every pair of dual bases $e_1, e_2, \ldots, e_n$ and $e_1^*, e_2^*, \ldots, e_n^*$ in $U$ and $U^*$, the vectors

$$e_1, e_2, \ldots, e_n, e_1^*, e_2^*, \ldots, e_n^*$$

form a symplectic basis in $W$; i.e., their Gramian is

$$J = \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}. \tag{16.37}$$


Exercise 16.16 Check that $\Omega_{2m} \oplus \Omega_{2k}$ is isometric to $\Omega_{2(m+k)}$.

37 See Sect. 16.3.1 on p. 400.


16.6.1 Symplectic Group 


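The isometries $F : W \to W$ of a nondegenerate skew-symmetric form $\omega$ on a $2n$-dimensional space $W$ are called symplectic operators. They form a group, denoted by $\operatorname{Sp}_\omega(W)$ and called the symplectic group of $W$ with respect to the form $\omega$. The matrices of symplectic operators written in an arbitrary symplectic basis of $W$ form the group of symplectic matrices

$$\operatorname{Sp}_{2n}(k) := \{F \in \operatorname{Mat}_{2n}(k) \mid F^t \cdot J \cdot F = J\}.$$

In the coordinate-free realization (16.35), every operator $F$ on $W = U \oplus U^*$ can be written in block form

$$F = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$

where $A : U \to U$, $B : U^* \to U$, $C : U \to U^*$, $D : U^* \to U^*$. Then the symplectic condition $F^t \cdot J \cdot F = J$ becomes an equality:

$$\begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix} = \begin{pmatrix} A^t & C^t \\ B^t & D^t \end{pmatrix} \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} -C^t A + A^t C & -C^t B + A^t D \\ -D^t A + B^t C & -D^t B + B^t D \end{pmatrix},$$

which is equivalent to the relations $C^t A = A^t C$, $D^t B = B^t D$, $E + C^t B = A^t D$. These show that there is an injective group homomorphism

$$\operatorname{GL}(U) \hookrightarrow \operatorname{Sp}_\omega(U^* \oplus U), \quad G \mapsto \begin{pmatrix} G & 0 \\ 0 & (G^t)^{-1} \end{pmatrix}.$$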
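
The block relations can be tested on small integer matrices. The sketch below assumes $n = 2$ and hypothetical sample matrices: `F_good` is the image of $G = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ under the homomorphism above (its lower block $(G^t)^{-1}$ was computed by hand), while `F_bad` has $A = D = E$, $C = 0$, and a nonsymmetric block $B$, so it violates $D^t B = B^t D$.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def is_symplectic(F, J):
    """Check the matrix condition F^t J F = J."""
    return mat_mul(mat_mul(transpose(F), J), F) == J

J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]

# Blocks: A = G = (1 2; 0 1), B = C = 0, D = (G^t)^{-1} = (1 0; -2 1).
F_good = [[1, 2, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 0, -2, 1]]

# Blocks: A = D = E, C = 0, B = (0 1; 0 0), which is not symmetric.
F_bad = [[1, 0, 0, 1],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

good = is_symplectic(F_good, J)
bad = is_symplectic(F_bad, J)
```

With $A = D = E$ and $C = 0$, the three relations reduce to $B = B^t$, which explains why a unipotent matrix of this shape is symplectic only for symmetric $B$.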


16.6.2 Lagrangian Subspaces 


Let $V$ be a symplectic space of dimension $2n$ with symplectic form $\omega$. The isotropic subspaces $L \subset V$ of maximal dimension $\dim L = n$ are called Lagrangian subspaces.


Proposition 16.7 Every isotropic subspace $U \subset V$ is contained in some symplectic subspace $W \subset V$ of dimension $\dim W = 2 \dim U$. Every basis of $U$ can be extended to some symplectic basis of $W$.


Proof Choose a basis $u_1, u_2, \ldots, u_m$ in $U$, extend it to some basis in $V$, and write $u_1^\vee, u_2^\vee, \ldots, u_m^\vee$ for the first $m$ vectors of the dual basis with respect to $\omega$. Therefore,

$$\omega(u_i, u_j^\vee) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases} \tag{16.38}$$

These relations will remain valid when we add to each $u_j^\vee$ a linear combination of vectors $u_i$. Let us replace each $u_j^\vee$ by a vector

$$w_j = u_j^\vee - \sum_{\nu < j} \omega(u_j^\vee, u_\nu^\vee) \cdot u_\nu. \tag{16.39}$$

Then we get vectors $w_1, w_2, \ldots, w_m$ satisfying the same orthogonality relations (16.38) and spanning an isotropic subspace, because for all $1 \le i < j \le m$, the only correction term that survives comes from $\nu = i$ in the sum for $w_j$, and

$$\omega(w_i, w_j) = \omega(u_i^\vee, u_j^\vee) + \omega(u_j^\vee, u_i^\vee) = 0.$$

Therefore, the vectors $u_i$ and $w_j$ for $1 \le i, j \le m$ form a symplectic basis in their linear span. It remains to denote the latter by $W$. □


Theorem 16.6 For every Lagrangian subspace $L \subset V$, there exists a Lagrangian subspace $L' \subset V$ such that $V = L \oplus L'$. For every basis $e$ in $L$, there exists a unique basis $e'$ in $L'$ such that the $2n$ vectors $e$, $e'$ form a symplectic basis in $V$. For fixed $L'$, all Lagrangian subspaces $L''$ complementary to $L$ are in bijection with the linear maps $f : L' \to L$ that are anti-self-adjoint with respect to $\omega$.


Proof If we repeat the proof of Proposition 16.7 for $U = L$ and $(u_1, u_2, \ldots, u_m) = e$, then we get $W = V = L \oplus L'$, where $L'$ is spanned by the vectors $w_j$ from (16.39). This proves the first statement. To prove the second, we note that the correlation $\omega : V \to V^*$, $v \mapsto \omega(\ast, v)$, sends $L'$ to the subspace $\omega(L') \subset V^*$, which is isomorphic to $L^*$. Let us identify $\omega(L')$ with $L^*$ by means of this isomorphism. Then the basis $w_1, w_2, \ldots, w_n$ in $L'$, which extends $e$ to a symplectic basis in $V$, is uniquely described as the preimage of the basis $e^*$ in $L^*$ dual to $e$. This proves the second statement. Every subspace $L'' \subset V = L \oplus L'$ complementary to $L$ is mapped isomorphically onto $L'$ by the projection along $L$. This means that for every $w \in L'$, there exists a unique vector $f(w) \in L$ such that $w + f(w) \in L''$. The assignment $w \mapsto f(w)$ produces a linear map $f : L' \to L$, whose graph is $L''$. Since both subspaces $L$, $L'$ are isotropic, for all $w_1, w_2 \in L'$ (with $f(w_1), f(w_2) \in L$), the equality $\omega(w_1 + f(w_1), w_2 + f(w_2)) = \omega(w_1, f(w_2)) + \omega(f(w_1), w_2)$ holds. Therefore, the subspace $L''$, the graph of $f$, is isotropic if and only if for all $w_1, w_2 \in L'$,

$$\omega(w_1, f(w_2)) = -\omega(f(w_1), w_2),$$

which means that the map $f$ is anti-self-adjoint with respect to $\omega$. □


16.6.3 Pfaffian 


Consider the elements $\{a_{ij}\}_{i<j}$ of a skew-symmetric $(2n) \times (2n)$ matrix $A = (a_{ij})$ above the main diagonal as independent commutative variables and write $\mathbb{Z}[a_{ij}]$ for the ring of polynomials with integer coefficients in these variables. We are going to show that there exists a unique polynomial $\operatorname{Pf}(A) \in \mathbb{Z}[a_{ij}]$ such that $\operatorname{Pf}(A)^2 = \det(A)$ and $\operatorname{Pf}(J') = 1$, where $J'$ is the block diagonal matrix constructed from $n$ identical $2 \times 2$ blocks

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

The polynomial $\operatorname{Pf}(A)$ is called the Pfaffian of the skew-symmetric matrix $A$. We will show that it has the following explicit expansion through the matrix elements:

$$\operatorname{Pf}(A) = \sum_{\{i_1, j_1\} \sqcup \cdots \sqcup \{i_n, j_n\} = \{1, 2, \ldots, 2n\}} \operatorname{sgn}(i_1 j_1 i_2 j_2 \ldots i_n j_n) \cdot a_{i_1 j_1} a_{i_2 j_2} \cdots a_{i_n j_n}, \tag{16.40}$$

where the summation ranges over all decompositions of the set $\{1, 2, \ldots, 2n\}$ into a disjoint union of $n$ pairs $\{i_\nu, j_\nu\}$ with $i_\nu < j_\nu$, and $\operatorname{sgn}(g)$ means the sign of the permutation $g \in S_{2n}$. For example,

$$\operatorname{Pf}\begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix} = a, \qquad \operatorname{Pf}\begin{pmatrix} 0 & a & b & c \\ -a & 0 & d & e \\ -b & -d & 0 & f \\ -c & -e & -f & 0 \end{pmatrix} = af - be + cd.$$

It is convenient to put $a_{ij} = -a_{ji}$ for $i > j$ and consider $\{i_\nu, j_\nu\}$ in (16.40) as unordered pairs.
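
Formula (16.40) can be evaluated literally by enumerating all splittings into pairs. The sketch below does this for integer matrices and compares the square of the result with the determinant; the helper names (`pairings`, `sign`, `pfaffian`, `det`) are hypothetical, and the enumeration is meant for small sizes only, since the number of pairings grows as $(2n - 1)!!$.

```python
def pairings(items):
    """Yield all splittings of a sorted list into pairs (i, j) with i < j."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail

def sign(perm):
    """Sign of a permutation given as a list, via its number of inversions."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def pfaffian(A):
    """Right-hand side of (16.40) for a skew-symmetric matrix A (0-based indices)."""
    total = 0
    for p in pairings(list(range(len(A)))):
        term = sign([x for pair in p for x in pair])
        for i, j in p:
            term *= A[i][j]
        total += term
    return total

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# The 4 x 4 example with (a, b, c, d, e, f) = (1, 2, 3, 4, 5, 6).
A = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]
pf, d = pfaffian(A), det(A)
```

For this matrix the three pairings contribute $af$, $-be$, and $cd$, in agreement with the displayed $4 \times 4$ example.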


Exercise 16.17 Verify that:

(a) The right-hand side of (16.40) is unchanged under transpositions within pairs $(i_\nu, j_\nu)$.
(b) $\operatorname{sgn}(i_1 j_1 i_2 j_2 \ldots i_n j_n)$ does not depend on the order of the pairs.


Now we begin the construction of $\sqrt{\det A}$. Consider a matrix $A$, the Gramian of a skew-symmetric form $\omega$ on the coordinate vector space $K^{2n}$ over the field $K = \mathbb{Q}(a_{ij})$ of rational functions in $a_{ij}$ with coefficients in $\mathbb{Q}$. Since $\omega$ is nondegenerate, it follows from Theorem 16.5 that there exists a basis $e_1, e_1^*, e_2, e_2^*, \ldots, e_n, e_n^*$ in $K^{2n}$ with Gramian $J'$. Therefore, $A = C \cdot J' \cdot C^t$ for some matrix $C \in \operatorname{GL}_{2n}(K)$. Hence, $\det(A) = \det(C)^2$, because $\det J' = 1$. It remains to check that $\det C$ coincides with the right-hand side of (16.40). To this end, we introduce another collection of commuting independent variables $\{b_{ij}\}_{1 \le i < j \le 2n}$, put $b_{ij} = -b_{ji}$ for $i > j$, and organize them all into a skew-symmetric matrix $B = (b_{ij})$. Consider the ring of Grassmannian polynomials in the skew-commuting variables $\xi = (\xi_1, \xi_2, \ldots, \xi_{2n})$ with coefficients in the (quite huge) commutative ring $K[b_{ij}]$ and define there a homogeneous Grassmannian polynomial of degree 2 by

$$\beta_B(\xi) := (\xi B) \wedge \xi^t = \sum_{ij} b_{ij}\, \xi_i \wedge \xi_j.$$

Since even monomials $\xi_i \wedge \xi_j$ commute with each other, and every unordered pair $\{i, j\}$ occurs twice in the sum, the $n$th power of $\beta_B(\xi)$ is equal to

$$\beta_B \wedge \beta_B \wedge \cdots \wedge \beta_B = 2^n n! \cdot \xi_1 \wedge \xi_2 \wedge \cdots \wedge \xi_{2n} \cdot \sum_{\{i_1, j_1\} \sqcup \cdots \sqcup \{i_n, j_n\} = \{1, 2, \ldots, 2n\}} \operatorname{sgn}(i_1 j_1 i_2 j_2 \ldots i_n j_n) \cdot b_{i_1 j_1} b_{i_2 j_2} \cdots b_{i_n j_n}. \tag{16.41}$$

The sum here is the same as in (16.40). Let us denote it by $\operatorname{Pf}(B)$ and pass to new Grassmannian variables $\eta = (\eta_1, \eta_2, \ldots, \eta_{2n})$ related to $\xi$ by $\xi = \eta C$, where $C \in \operatorname{GL}_{2n}(K)$ is the same matrix as above. The right-hand side of (16.41) is now equal to

$$2^n n! \cdot \xi_1 \wedge \xi_2 \wedge \cdots \wedge \xi_{2n} \cdot \operatorname{Pf}(B) = 2^n n! \cdot \det C \cdot \eta_1 \wedge \eta_2 \wedge \cdots \wedge \eta_{2n} \cdot \operatorname{Pf}(B).$$

On the other hand, the quadratic polynomial $\beta_B(\xi)$ under the substitution $\xi = \eta C$ is equal to

$$\beta_B(\xi) = (\xi B) \wedge \xi^t = (\eta C B) \wedge (\eta C)^t = (\eta\, C B C^t) \wedge \eta^t = \beta_{CBC^t}(\eta).$$

Therefore, the $n$th power of $\beta_B(\xi)$ is expanded through $\eta_1, \eta_2, \ldots, \eta_{2n}$ as

$$\beta_{CBC^t}(\eta) \wedge \beta_{CBC^t}(\eta) \wedge \cdots \wedge \beta_{CBC^t}(\eta) = 2^n n! \cdot \eta_1 \wedge \eta_2 \wedge \cdots \wedge \eta_{2n} \cdot \operatorname{Pf}(CBC^t),$$

whence $\operatorname{Pf}(CBC^t) = \operatorname{Pf}(B) \cdot \det C$ in the ring $K[b_{ij}]$. Evaluating the variables $b_{ij}$ by the substitution $B = J'$ leads to the equality $\operatorname{Pf}(A) = \det C$ in the field $K = \mathbb{Q}(a_{ij})$. Hence $\sqrt{\det A} = \det C = \operatorname{Pf}(A)$ can be computed by formula (16.40) and therefore lies in $\mathbb{Z}[a_{ij}]$. To prove that such a square root is unique, we note that the quadratic polynomial $x^2 - \det A = (x - \operatorname{Pf}(A))(x + \operatorname{Pf}(A)) \in \mathbb{Z}[a_{ij}][x]$ has just two roots, $x = \pm\operatorname{Pf}(A)$, in the integral domain $\mathbb{Z}[a_{ij}]$. The normalization $\operatorname{Pf}(J') = 1$ uniquely determines one of them.


Problems for Independent Solution to Chap. 16 


Problem 16.1 Give an example of a vector space $V$ with a nondegenerate bilinear form $\beta$ and subspace $U \subset V$ such that the restriction of $\beta$ to $U$ is nondegenerate and $U^\perp \neq {}^\perp U$.

Problem 16.2 Give an example of a (nonsymmetric) correlation $\beta : V \to V^*$ and subspace $U \subset V$ such that $V = U \oplus \ker \beta$ and $U \cap \beta^{-1}(\operatorname{Ann} U) \neq 0$.

Problem 16.3 Show that a bilinear form on $V$ is regular³⁸ if and only if $V^\perp \cap {}^\perp V = 0$.

Problem 16.4 Show that for every nondegenerate indecomposable³⁹ bilinear form $\beta$, the symmetric part $\beta_+ = (\beta + \beta^*)/2$ or skew-symmetric part $\beta_- = (\beta - \beta^*)/2$ is nondegenerate⁴⁰ as well.

38 See Sect. 16.1.7 on p. 392.
39 See Sect. 16.4.2 on p. 405.
40 Compare with Exercise 16.7 on p. 391.

Problem 16.5 Show that for every linear operator $f$ on a space with nondegenerate bilinear form, $\mathcal{E}\ell(f) = \mathcal{E}\ell(f^\vee) = \mathcal{E}\ell({}^\vee f)$.

Problem 16.6 (Reflexive Operators) Show that for every reflexive⁴¹ operator $f$ on a space with nondegenerate bilinear form: (a) $\ker f^\vee = (\operatorname{im} f)^\perp = {}^\perp(\operatorname{im} f)$; (b) $\operatorname{im} f^\vee = (\ker f)^\perp = {}^\perp(\ker f)$.

Problem 16.7 (Normal Operators) A reflexive operator $f$ on a space with nondegenerate bilinear form is called normal if $ff^\vee = f^\vee f$. Show that

(a) (anti) self-adjoint and isometric operators are normal;
(b) for every normal operator $f$, both orthogonals $(\operatorname{im} f)^\perp$ and $(\ker f)^\perp$ are $f$-invariant.

Problem 16.8 (Isometries) Let $f$ be an isometry of a nondegenerate bilinear form over an algebraically closed field $k$ with $\operatorname{char} k \neq 2$. Show that $f$ is similar to $f^{-1}$ and use this to prove⁴² that all Jordan chains with eigenvalues $\lambda \neq \pm 1$ in any Jordan basis of $f$ split into disjoint pairs of chains with equal lengths and inverse eigenvalues. Prove that there exists a Jordan basis for $f$ in which the linear spans of these pairs of chains are biorthogonal to each other.

Problem 16.9 Show that the characteristic polynomial of every symplectic matrix $F \in \operatorname{Sp}_{2n}(k)$ is reciprocal, i.e., $\chi_F(t) = t^{2n}\chi_F(t^{-1})$. In particular, $\det F = 1$.

Problem 16.10 Show that every Lagrangian subspace $L$ in a symplectic space coincides with $L^\perp$.


Problem 16.11 For all d in the range 1 < d < n, show that the symplectic group 
Sp,, (2,) acts transitively on 2d-dimensional symplectic subspaces W C Qo, 
and on d-dimensional isotropic subspaces U C Qyp. 


Problem 16.12 Write an explicit expansion for the Pfaffian of a skew-symmetric 
6 x 6 matrix. 


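Problem 16.13 Compute the 3-diagonal Pfaffian (all elements not shown are equal
to zero)

$$
\operatorname{Pf}\begin{pmatrix}
0 & a_1 & & & & \\
-a_1 & 0 & b_1 & & & \\
 & -b_1 & 0 & \ddots & & \\
 & & \ddots & \ddots & b_{n-1} & \\
 & & & -b_{n-1} & 0 & a_n \\
 & & & & -a_n & 0
\end{pmatrix}.
$$


⁴¹ See Sect. 16.3.1 on p. 400.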
⁴² Independently of Theorem 16.2.

Problem 16.14 For arbitrary $n \in \mathbb{N}$ and even $m \le n$, consider an arbitrary
skew-symmetric $n \times n$ matrix $A$ and $m \times n$ matrix $C$. Prove that

$$
\operatorname{Pf}(C A C^t) = \sum_{\# I = m} \operatorname{Pf}(A_I) \cdot \det(C_I),
$$

where the summation is over all collections $I = (i_1, i_2, \ldots, i_m)$ of strictly
increasing indexes, and $C_I$, $A_I$ denote the square $m \times m$ submatrices situated in the
$I$-columns of $C$ and in the $I$-rows and $I$-columns of $A$.

Problem 16.15 Check that the canonical operator of the nonsymmetric Euler's form
in Example 16.4 on p. 394 equals $T^{-n-1} = e^{-(n+1)D} : f(t) \mapsto f(t - n - 1)$. Use
this to show that Euler's form is equivalent to $U_{n+1}$ over $\mathbb{Q}$.

Problem 16.16 (Braid Group Action on Exceptional Bases) Let the vectors
$e = (e_0, e_1, \ldots, e_n)$ form an exceptional⁴³ basis for a nonsymmetric form $\beta$ on $V$.
Write $L_i$ and $R_i$ for the changes of basis that transform a pair of sequential vectors
$e_{i-1}$, $e_i$ by the rules

$$
L_i : (e_{i-1}, e_i) \mapsto (Le_i, e_{i-1}), \quad \text{where } Le_i = e_i - \beta(e_{i-1}, e_i) \cdot e_{i-1},
$$

$$
R_i : (e_{i-1}, e_i) \mapsto (e_i, Re_{i-1}), \quad \text{where } Re_{i-1} = e_{i-1} - \beta(e_{i-1}, e_i) \cdot e_i,
$$

and leave all the other basis vectors fixed. Show that the transformations $L_i$, $R_i$
for $1 \le i \le n$ take an exceptional basis to an exceptional basis and satisfy the
relations

$$
\begin{aligned}
L_i R_i &= R_i L_i = \mathrm{Id},\\
L_i L_{i+1} L_i &= L_{i+1} L_i L_{i+1} && \text{for all } 1 \le i \le n-1,\\
L_i L_j &= L_j L_i && \text{for } 1 \le i, j \le n \text{ and } |i - j| \ge 2.
\end{aligned}
$$

The group $B_{n+1}$ presented by generators $x_1, x_2, \ldots, x_n$ and relators

$$
x_i x_{i+1} x_i x_{i+1}^{-1} x_i^{-1} x_{i+1}^{-1} \ \text{ for } 1 \le i \le n-1 \quad\text{and}\quad x_i x_j x_i^{-1} x_j^{-1} \ \text{ for } 1 \le i, j \le n,\ |i - j| \ge 2
$$

is called the braid group of $n + 1$ strands. The symmetric group $S_{n+1}$ can be
constructed from $B_{n+1}$ by adding $n$ extra relators $x_i^2$ for $1 \le i \le n$. There
is an open conjecture that any two exceptional bases of the Euler form from
Example 16.4 on p. 394 can be transformed to each other by means of the action
of the braid group, the action of the isometry group, and the action of the group
$(\mathbb{Z}/(2))^{n+1}$ that multiplies basis vectors by $\pm 1$. This conjecture has been verified
only for $n \le 3$. The case $n = 2$ is considered in the next problem.

⁴³ That is, with upper unitriangular Gramian $B_e$; see Example 16.4 on p. 394.
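The rules for $L_i$ and $R_i$ are easy to experiment with numerically before attempting a proof. The following Python sketch is only an illustration under stated assumptions: vectors are integer coordinate lists, the form is given by a sample upper unitriangular Gramian `B` chosen arbitrarily, and all function names are ours rather than the book's.

```python
def beta(B, u, v):
    """Value beta(u, v) for coordinate vectors u, v and Gram matrix B."""
    n = len(B)
    return sum(u[i] * B[i][j] * v[j] for i in range(n) for j in range(n))


def gram(B, basis):
    """Gram matrix of a list of vectors with respect to beta."""
    return [[beta(B, u, v) for v in basis] for u in basis]


def is_exceptional(B, basis):
    """True if the Gram matrix of the basis is upper unitriangular."""
    G = gram(B, basis)
    n = len(G)
    return (all(G[i][i] == 1 for i in range(n))
            and all(G[i][j] == 0 for i in range(n) for j in range(i)))


def L(B, basis, i):
    """Left mutation L_i of the pair (e_{i-1}, e_i), for 1 <= i <= n."""
    a, b = basis[i - 1], basis[i]
    c = beta(B, a, b)
    Le = [bj - c * aj for aj, bj in zip(a, b)]  # e_i - beta(e_{i-1}, e_i) e_{i-1}
    return basis[:i - 1] + [Le, a] + basis[i + 1:]


def R(B, basis, i):
    """Right mutation R_i of the pair (e_{i-1}, e_i), for 1 <= i <= n."""
    a, b = basis[i - 1], basis[i]
    c = beta(B, a, b)
    Re = [aj - c * bj for aj, bj in zip(a, b)]  # e_{i-1} - beta(e_{i-1}, e_i) e_i
    return basis[:i - 1] + [b, Re] + basis[i + 1:]


# Hypothetical sample data: an upper unitriangular integer Gramian of size 4.
B = [[1, 2, -1, 3],
     [0, 1, 4, 0],
     [0, 0, 1, 5],
     [0, 0, 0, 1]]
E = [[int(i == j) for j in range(4)] for i in range(4)]  # e_0, ..., e_3

print(gram(B, L(B, E, 2)))  # Gramian of the mutated basis
```

Checking `is_exceptional` and the braid relations on such samples is of course no substitute for the general argument, but it helps to catch index mistakes in the formulas.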


Problem 16.17 (Markov's Equation) Let $\beta_{\mathbb{C}} : \mathbb{C}^3 \times \mathbb{C}^3 \to \mathbb{C}$ be a nonsymmetric
indecomposable bilinear form such that its restriction to $\mathbb{Z}^3 \subset \mathbb{C}^3$ takes integer
values and has Gram determinant 1 in the standard basis of $\mathbb{C}^3$. Write

$$
\beta : \mathbb{Z}^3 \times \mathbb{Z}^3 \to \mathbb{Z}
$$

for the restricted $\mathbb{Z}$-bilinear form. Convince yourself that $\mathbb{C}^3$ with the bilinear
form $\beta_{\mathbb{C}}$ is isometrically isomorphic to the space $U_3(\mathbb{C})$ from Example 16.6 on
p. 398. Check that $\beta$ has Gram determinant 1 in every basis of $\mathbb{Z}^3$ over $\mathbb{Z}$.


(a) Let the Gramian of $\beta$ in some basis $u$ of $\mathbb{Z}^3$ over $\mathbb{Z}$ be upper unitriangular:

$$
B_u = \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}.
$$

Prove⁴⁴ that $x = 3a$, $y = 3b$, $z = 3c$ for some $a, b, c \in \mathbb{Z}$ satisfying Markov's
equation

$$
a^2 + b^2 + c^2 = 3abc. \tag{16.42}
$$

(b) Verify that up to permutations, all positive solutions $(a, b, c) \in \mathbb{N}^3$ of the
Markov equation (16.42) are obtained from $(1, 1, 1)$ by successive transformations⁴⁵
replacing a solution $(a, b, c)$ either by $(3bc - a, b, c)$ or by $(a, 3ac - b, c)$
or by $(a, b, 3ab - c)$.

(c) Prove that an exceptional basis⁴⁶ of $\mathbb{Z}^3$ over $\mathbb{Z}$ can be transformed to some
basis with Gramian⁴⁷

$$
\begin{pmatrix} 1 & 3 & 6 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix} \tag{16.43}
$$

by the action of the braid group $B_3$ described in Problem 16.16 and changing
the directions of the basis vectors.


⁴⁴ Hint: since $(\mathbb{C}^3, \beta_{\mathbb{C}})$ is isomorphic to $U_3$, the canonical operator of $\beta$ has $\operatorname{tr} B_u^{-1} B_u^t = 3$.

⁴⁵ They come from viewing equation (16.42) as a quadratic equation in one of the variables $a$, $b$, $c$ and
replacing one known root by the other provided by Viète's formula.

⁴⁶ That is, a basis with upper unitriangular Gramian.

⁴⁷ Since the Gramian (16.43) coincides with the Gramian of the Euler form of rank 3, this verifies
the conjecture from Problem 16.16 for rank-3 lattices. Moreover, we see that Euler's form of rank
3 is the unique integer bilinear form on $\mathbb{Z}^3$ that is indecomposable over $\mathbb{C}$ and admits exceptional
bases.

(d*) Markov's conjecture famously asserts that a triple of positive integer solutions
of equation (16.42) is uniquely determined by its maximal element, i.e., for
all positive integer solutions $a_1 \ge b_1 \ge c_1$ and $a_2 \ge b_2 \ge c_2$, the equality
$a_1 = a_2$ forces $b_1 = b_2$ and $c_1 = c_2$. The question has remained open for
more than a century.
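The three transformations of part (b) can be explored by computer. The sketch below is a hedged illustration, not part of the problem: the function names and the search bound are our own choices. It walks from $(1, 1, 1)$ through the replacements $(a, b, c) \mapsto (3bc - a, b, c)$, $(a, 3ac - b, c)$, $(a, b, 3ab - c)$ and collects the sorted triples whose maximal element does not exceed a given bound.

```python
def is_markov(t):
    """Check Markov's equation a^2 + b^2 + c^2 = 3abc for a triple t."""
    a, b, c = t
    return a * a + b * b + c * c == 3 * a * b * c


def neighbors(t):
    """The three triples obtained from t by replacing one root via Viete's formula."""
    a, b, c = t
    return [(3 * b * c - a, b, c), (a, 3 * a * c - b, c), (a, b, 3 * a * b - c)]


def markov_triples(bound):
    """Sorted Markov triples with maximum <= bound reachable from (1, 1, 1)."""
    seen = {(1, 1, 1)}
    stack = [(1, 1, 1)]
    while stack:
        t = stack.pop()
        for s in neighbors(t):
            s = tuple(sorted(s))  # work up to permutations
            if max(s) <= bound and s not in seen:
                seen.add(s)
                stack.append(s)
    return sorted(seen, key=lambda s: (s[2], s))


for triple in markov_triples(100):
    print(triple, is_markov(triple))
```

In the range up to 100, the maximal elements of the triples found are pairwise distinct, in agreement with the conjecture of part (d*); this is an observation on a small sample, not evidence of anything general.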


Chapter 17 
Quadratic Forms and Quadrics 


Throughout this chapter, we assume that $\operatorname{char}\Bbbk \ne 2$.


17.1 Quadratic Forms and Their Polarizations 


17.1.1 Space with a Quadratic Form 


Homogeneous quadratic polynomials¹ $q \in S^2V^*$ over a vector space $V$ are called
quadratic forms. Every such polynomial may be considered a function $q : V \to \Bbbk$.
A choice of basis $x = (x_1, x_2, \ldots, x_n)$ in $V^*$ allows us to write $q$ as

$$
q(x) = \sum_{i,j} q_{ij} x_i x_j, \tag{17.1}
$$

where the summation is over all pairs of indices $1 \le i, j \le n$, and the coefficients
are symmetric; i.e., they satisfy $q_{ij} = q_{ji}$. In other words, for $i \ne j$, both coefficients
$q_{ij} = q_{ji}$ are equal to half² the coefficient of $x_i x_j$ in the reduced expansion of
the polynomial $q \in \Bbbk[x_1, x_2, \ldots, x_n]$ in terms of the monomials. The coefficients
$q_{ij}$ are organized in a symmetric $n \times n$ matrix $Q = (q_{ij})$. In matrix notation, the
equality (17.1) becomes

$$
q(x) = \sum_{i,j} x_i q_{ij} x_j = x Q x^t, \tag{17.2}
$$

where $x = (x_1, x_2, \ldots, x_n)$ means a row of variables, and $x^t$ is the transposed column.

¹ See Sect. 11.2 on p. 258.
² Note that for $\operatorname{char}\Bbbk = 2$, the representation (17.2) may be impossible.

We see that a quadratic function $q : V \to \Bbbk$ can be written as $q(v) = \tilde q(v, v)$, where

$$
\tilde q : V \times V \to \Bbbk, \quad \tilde q(x, y) = x Q y^t,
$$

is a symmetric bilinear form with Gramian $Q$ in the basis of $V$ dual to $x$. This bilinear
form is called the polarization of the quadratic form $q$. It does not depend on the
choice of basis and is uniquely determined by $q$ as

$$
\tilde q(u, w) = \tfrac12 \bigl( q(u + w) - q(u) - q(w) \bigr) = \tfrac14 \bigl( q(u + w) - q(u - w) \bigr). \tag{17.3}
$$

Exercise 17.1 Check this and verify that in coordinates, $\tilde q(x, y) = \frac12 \sum_i y_i \frac{\partial q(x)}{\partial x_i}$.
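As a small numerical illustration of formula (17.3), the following Python sketch recovers the Gramian of a quadratic form from its values alone. The sample form `q` and the helper names are hypothetical choices made for this example; exact rational arithmetic is used so that division by 2 is harmless.

```python
from fractions import Fraction


def q(v):
    """Sample form q = x1^2 + 4 x1 x2 - 6 x2 x3 + 5 x3^2 (chosen for illustration)."""
    x1, x2, x3 = v
    return x1 * x1 + 4 * x1 * x2 - 6 * x2 * x3 + 5 * x3 * x3


def polarization(q, u, w):
    """First expression in (17.3): (q(u + w) - q(u) - q(w)) / 2."""
    s = [a + b for a, b in zip(u, w)]
    return Fraction(q(s) - q(u) - q(w), 2)


def gramian(q, n):
    """Gram matrix of the polarization in the standard basis of k^n."""
    e = [[int(i == j) for j in range(n)] for i in range(n)]
    return [[polarization(q, e[i], e[j]) for j in range(n)] for i in range(n)]


Q = gramian(q, 3)
print(Q)  # off-diagonal entries are half the coefficients of the mixed monomials
```

The off-diagonal entries come out as half the coefficients of $x_i x_j$, exactly as described before (17.2).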


Thus, for $\operatorname{char}\Bbbk \ne 2$, the polarization of quadratic forms establishes the linear
isomorphism

$$
S^2V^* \xrightarrow{\ \sim\ } \operatorname{Hom}_+(V, V^*), \quad q \mapsto \tilde q,
$$

between homogeneous polynomials of degree 2 and symmetric bilinear forms on
$V$. It allows us to transfer many notions related to symmetric bilinear forms to the
world of quadratic forms. For example, a linear map $f : U_1 \to U_2$ between two
spaces $U_1$, $U_2$ equipped with quadratic forms $q_1 \in S^2U_1^*$, $q_2 \in S^2U_2^*$ is isometric,³
that is, for all $u, w \in U_1$, it satisfies $\tilde q_1(u, w) = \tilde q_2(fu, fw)$, if and only if $q_1(u) =
q_2(fu)$ for all $u \in U_1$. We call such maps homomorphisms of spaces with quadratic
forms. Bijective homomorphisms are called equivalences⁴ of quadratic forms. Thus,
a homomorphism of quadratic forms is nothing but a linear change of variables,
and two quadratic forms are equivalent if one of them can be transformed to the
other by an invertible linear change of variables. Classification of quadratic forms
up to equivalence means the same as classification of symmetric bilinear forms up
to isometry.


17.1.2 Gramian and Gram Determinant


The symmetric matrix $Q$ in (17.2) is called the Gramian⁵ of the quadratic form $q$ in
the coordinates $x$. It coincides with the Gramian of the bilinear form $\tilde q$ in the basis
$e = (e_1, e_2, \ldots, e_n)$ of $V$ dual to $x = (x_1, x_2, \ldots, x_n)$, i.e., $q_{ij} = \tilde q(e_i, e_j)$. If bases $x$
and $y$ in $V^*$ are related by $x = y \cdot C_{yx}$, then their Gramians $Q_x$, $Q_y$ satisfy the relation
$Q_y = C_{yx} Q_x C_{yx}^t$. Therefore, under a change of basis, the determinant of the Gram
matrix is multiplied by the nonzero square $\det^2 C_{yx} \in \Bbbk^*$. We conclude that the class
of $\det Q$ modulo multiplication by nonzero squares does not depend on the basis. We
denote this class by $\det q \in \Bbbk/\Bbbk^{*2}$ and call it the Gram determinant of the quadratic
form $q$. If $\det q = 0$, then the quadratic form $q$ is called singular.⁶ Otherwise, $q$ is
called nonsingular.⁷ Let us write $a \sim b$ for $a, b \in \Bbbk$ such that $a = \lambda^2 b$ for some
$\lambda \in \Bbbk^*$. If $\det q_1 \not\sim \det q_2$, then $q_1$ and $q_2$ are certainly inequivalent.

³ See Sect. 16.1 on p. 387.
⁴ Or isomorphisms.
⁵ Or Gram matrix.


17.1.3 Kernel and Rank


Let us write $\hat q : V \to V^*$, $v \mapsto \tilde q(\,\ast\,, v)$, for the correlation map⁸ of the symmetric
bilinear form $\tilde q$ and call it the correlation of the quadratic form $q$. The correlation
takes a vector $v \in V$ to the covector $u \mapsto \tilde q(u, v)$. We write

$$
\ker q \stackrel{\text{def}}{=} \ker \hat q = \{ v \in V \mid \forall\, u \in V \ \ \tilde q(u, v) = 0 \}
$$

for the kernel of the correlation and call it the kernel of the quadratic form $q$. A form
$q$ is nonsingular if and only if $\ker q = 0$. The number

$$
\operatorname{rk} q \stackrel{\text{def}}{=} \dim \operatorname{im} \hat q = \dim (V / \ker q)
$$

is called the rank of the quadratic form $q$. It coincides with the rank of the Gramian
of $q$ in any basis of $V$. Quadratic forms of different ranks certainly are inequivalent.

For a singular quadratic form $q$ on $V$, we write $q_{\mathrm{red}}$ for the quadratic form on
$V / \ker q$ defined by $q_{\mathrm{red}}([v]) = q(v)$.


Exercise 17.2 Check that the form $q_{\mathrm{red}}$ is well defined and nondegenerate.


17.1.4 Sums of Squares 


A basis in $V$ is called orthogonal for $q$ if $q$ has a diagonal Gramian in this basis, i.e.,
is a linear combination of squares:

$$
q(x) = a_1 x_1^2 + a_2 x_2^2 + \cdots + a_r x_r^2, \quad \text{where } r = \operatorname{rk} q. \tag{17.4}
$$

⁶ Or degenerate.
⁷ Or nondegenerate.
⁸ See Sect. 16.1 on p. 387.

By Lagrange's theorem,⁹ every quadratic form admits an orthogonal basis over
every field $\Bbbk$ with $\operatorname{char}\Bbbk \ne 2$. Note that the number of nonzero coefficients in (17.4)
is equal to $\operatorname{rk} q$ and therefore does not depend on the choice of orthogonal basis.
If $\Bbbk$ is algebraically closed, then the substitution $x_i \mapsto x_i / \sqrt{a_i}$ simplifies (17.4) to
$q(x) = x_1^2 + x_2^2 + \cdots + x_r^2$. Therefore, over an algebraically closed field $\Bbbk$ of
characteristic $\operatorname{char}\Bbbk \ne 2$, two quadratic forms are equivalent if and only if they
have equal ranks.
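One common way to find an orthogonal basis in practice is symmetric Gaussian elimination: every row operation on the Gramian is followed by the same column operation. The Python sketch below illustrates this over $\mathbb{Q}$; it is one possible implementation under our own naming and is not claimed to follow the proof of Theorem 16.4. It returns the diagonal coefficients together with a matrix `C` whose rows are the new basis vectors, so that `C Q C^t` is diagonal.

```python
from fractions import Fraction


def lagrange_diagonalize(Q):
    """Return (D, C) with C * Q * C^t = diag(D) for a symmetric rational matrix Q."""
    n = len(Q)
    A = [[Fraction(x) for x in row] for row in Q]
    C = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_row_col(dst, src, t):
        # Replace the basis vector e_dst by e_dst + t * e_src:
        # a row operation on A and C, then the same column operation on A.
        for j in range(n):
            A[dst][j] += t * A[src][j]
            C[dst][j] += t * C[src][j]
        for i in range(n):
            A[i][dst] += t * A[i][src]

    def swap(i, j):
        # Exchange the basis vectors e_i and e_j.
        A[i], A[j] = A[j], A[i]
        C[i], C[j] = C[j], C[i]
        for row in A:
            row[i], row[j] = row[j], row[i]

    for k in range(n):
        if A[k][k] == 0:
            # Prefer a later basis vector with nonzero square.
            p = next((i for i in range(k + 1, n) if A[i][i] != 0), None)
            if p is not None:
                swap(k, p)
            else:
                # All remaining squares vanish; e_k + e_p has square
                # 2 * q~(e_k, e_p), which is nonzero because char != 2.
                p = next((i for i in range(k + 1, n) if A[k][i] != 0), None)
                if p is None:
                    continue  # e_k is orthogonal to all remaining vectors
                add_row_col(k, p, Fraction(1))
        for i in range(k + 1, n):
            if A[i][k] != 0:
                add_row_col(i, k, -A[i][k] / A[k][k])
    return [A[i][i] for i in range(n)], C


D, C = lagrange_diagonalize([[0, 1], [1, 0]])
print(D, C)
```

The number of nonzero entries of `D` is the rank of the form, in agreement with the remark after (17.4).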


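Example 17.1 (Binary Quadratic Forms) Quadratic forms in two variables are
called binary. Every nonzero binary quadratic form

$$
q(x) = a x_1^2 + 2b x_1 x_2 + c x_2^2 = (x_1, x_2) \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
$$

in appropriate coordinates $t_1$, $t_2$ becomes either $\alpha t_1^2$, where $\alpha \ne 0$, or $\alpha t_1^2 + \beta t_2^2$,
where $\alpha, \beta \ne 0$. In the first case, $ac - b^2 \sim \det q \sim \alpha \cdot 0 = 0$, i.e., the form $q$ is
singular. It vanishes identically on the 1-dimensional subspace $\operatorname{Ann}(t_1) \subset V$ and is
nonzero everywhere outside it. In the second case, $ac - b^2 \sim \det q \sim \alpha\beta \ne 0$ and $q$
is nonsingular. If there exists some $v = (\vartheta_1, \vartheta_2) \ne 0$ such that $q(v) = \alpha\vartheta_1^2 + \beta\vartheta_2^2 = 0$,
then $-\det q \sim -\alpha\beta \sim -\beta/\alpha = (\vartheta_1/\vartheta_2)^2$ is a square¹⁰ in $\Bbbk$, and

$$
q(t) = \alpha t_1^2 + \beta t_2^2 = \alpha \left( t_1 + \frac{\vartheta_1}{\vartheta_2}\, t_2 \right) \left( t_1 - \frac{\vartheta_1}{\vartheta_2}\, t_2 \right)
$$

is a product of two different linear forms. It vanishes identically on two different
1-dimensional subspaces and is nonzero everywhere outside them.


Exercise 17.3 Show that $q$ is a hyperbolic form¹¹ on $\Bbbk^2$ in this case.


If $-\det q$ is not a square, then $q(v) \ne 0$ for all $v \ne 0$. Such a form is called
anisotropic.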


17.1.5 Isotropic and Anisotropic Subspaces 


A vector $v \ne 0$ is called isotropic for $q$ if $q(v) = 0$. If $q(v) \ne 0$, then the vector $v$ is
called anisotropic. A subspace $U \subset V$ is called anisotropic if all nonzero $u \in U$ are
anisotropic, i.e., if $q|_U$ vanishes only at zero. A form $q$ is called anisotropic if the
whole space $V$ is anisotropic. For example, the Euclidean quadratic form $q(v) =
|v|^2$ is anisotropic. We have seen in Example 17.1 that a 2-dimensional subspace
$U$ over a field $\Bbbk$ with $\operatorname{char}\Bbbk \ne 2$ is anisotropic if and only if $-\det(q|_U)$ is not a
square in $\Bbbk$. In particular, if $\Bbbk$ is algebraically closed, then there are no anisotropic
subspaces of dimension $\ge 2$ over $\Bbbk$.

⁹ See Theorem 16.4 on p. 409.
¹⁰ Note that $\vartheta_1, \vartheta_2 \ne 0$, because $\alpha\vartheta_1^2 + \beta\vartheta_2^2 = 0$ and $v \ne 0$.
¹¹ See Example 16.2 on p. 393.

A subspace $U \subset V$ is called isotropic for a quadratic form $q$ if every vector $u \in U$
is isotropic, i.e., $q|_U \equiv 0$. In this case, formulas (17.3) imply that $\tilde q(u_1, u_2) = 0$ for
all $u_1, u_2 \in U$. Thus, $U$ is isotropic for a quadratic form $q$ if and only if $U$ is isotropic
for the bilinear form $\tilde q$ in the sense of Sect. 16.2.2 on p. 395. By Proposition 16.2,
the dimensions of isotropic subspaces for a nonsingular quadratic form are bounded
above by $\dim V / 2$.


17.1.6 Hyperbolic Forms 


Write $H_{2n}$ for the $2n$-dimensional hyperbolic space¹² considered up to isomorphism.
A convenient coordinate-free realization of $H_{2n}$ is the direct sum

$$
H_{2n} = U \oplus U^*, \quad \text{where } \dim U = n,
$$

equipped with the symmetric bilinear form $\tilde h(u_1 + \xi_1, u_2 + \xi_2) \stackrel{\text{def}}{=} \xi_1(u_2) + \xi_2(u_1)$,
where $u_1, u_2 \in U$, $\xi_1, \xi_2 \in U^*$. Both summands $U$, $U^*$ are isotropic. Mixed inner
products of vectors and covectors are given by contractions: $\tilde h(\xi, v) = \tilde h(v, \xi) =
\langle v, \xi \rangle$. Every basis of $H_{2n}$ arranged from dual bases

$$
e_1, e_2, \ldots, e_n, e_1^*, e_2^*, \ldots, e_n^* \tag{17.5}
$$

of $U$ and $U^*$ is hyperbolic, that is, has Gramian

$$
\begin{pmatrix} 0 & E \\ E & 0 \end{pmatrix},
$$

where both the zero and identity blocks $0$, $E$ are of size $n \times n$. The vectors $p_i = e_i + e_i^*$
and $q_i = e_i - e_i^*$ form an orthogonal basis of $H_{2n}$. Their inner squares are $\tilde h(p_i, p_i) = 2$
and $\tilde h(q_i, q_i) = -2$.

¹² See Example 16.2 on p. 393.

A quadratic form $h(v) = \tilde h(v, v)$ obtained from a bilinear form $\tilde h$ also is called
hyperbolic. It provides $h(u + \xi) = 2\xi(u)$. In coordinates $x$ with respect to the
hyperbolic basis (17.5), a hyperbolic quadratic form looks like

$$
h(x) = 2x_1 x_{n+1} + 2x_2 x_{n+2} + \cdots + 2x_n x_{2n},
$$

and after renaming $2x_i$ by $x_i$ for $1 \le i \le n$, it becomes

$$
x_1 x_{n+1} + x_2 x_{n+2} + \cdots + x_n x_{2n}.
$$

In coordinates $z$ with respect to the orthogonal basis $p_1, p_2, \ldots, p_n, q_1, q_2, \ldots, q_n$, it
is

$$
h(z) = 2z_1^2 + 2z_2^2 + \cdots + 2z_n^2 - 2z_{n+1}^2 - 2z_{n+2}^2 - \cdots - 2z_{2n}^2.
$$

Exercise 17.4 Check that the orthogonal direct sum $H_{2k} \oplus H_{2m}$ is equivalent to
$H_{2(k+m)}$.


Hyperbolic spaces are much like the symplectic spaces considered in Sect. 16.6 
on p. 411. For example, there is a symmetric analogue of Proposition 16.7 on p. 413. 


Proposition 17.1 Let $V$ be a vector space with nondegenerate symmetric bilinear
form $\beta$. Then every isotropic subspace $U \subset V$ is contained in some hyperbolic
subspace $W \subset V$ of dimension $\dim W = 2 \dim U$. Every basis of $U$ can be extended
to a hyperbolic basis of $W$.


Proof Choose a basis $u_1, u_2, \ldots, u_m$ in $U$, extend it to a basis in $V$, and write
$u_1^\vee, u_2^\vee, \ldots, u_m^\vee$ for the first $m$ vectors of the dual basis with respect to $\beta$. Therefore,

$$
\beta(u_i, u_j^\vee) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j. \end{cases} \tag{17.6}
$$

These relations remain valid if we add any linear combination of vectors $u_i$ to any
$u_j^\vee$. Replacing each $u_j^\vee$ by the vector

$$
w_j = u_j^\vee - \frac12 \sum_{\nu=1}^{m} \beta(u_j^\vee, u_\nu^\vee) \cdot u_\nu,
$$

we get a collection of $m$ vectors $w_1, w_2, \ldots, w_m$ satisfying the same orthogonality
relations (17.6). The linear span of the vectors $w_j$ is isotropic, because for all
$1 \le i, j \le m$,

$$
\beta(w_i, w_j) = \beta(u_i^\vee, u_j^\vee) - \frac12 \beta(u_i^\vee, u_j^\vee) - \frac12 \beta(u_j^\vee, u_i^\vee) = 0.
$$

Thus, the vectors $u_i$ and $w_j$ for $1 \le i, j \le m$ form a hyperbolic basis of their linear
span. It remains to denote the latter by $W$. □


Theorem 17.1 Every vector space $V$ with nondegenerate symmetric bilinear form
is an orthogonal direct sum $V = H_{2k} \oplus U$, where $H_{2k} \subset V$ is some hyperbolic
subspace, $U = H_{2k}^\perp$ is anisotropic, and either of the two summands may vanish.

Proof Induction on $\dim V$. If $V$ is anisotropic (in particular, for $\dim V = 1$), there
is nothing to prove. If there exists some isotropic nonzero vector $e \in V$, then by
Proposition 17.1, $e$ is contained in some hyperbolic plane $H_2$. Since the form is
nondegenerate in this plane, $V$ splits into the orthogonal direct sum $V = H_2 \oplus H_2^\perp$.
If $H_2^\perp = 0$, we are done. If not, then by the inductive hypothesis, $H_2^\perp = H_{2m} \oplus U$,
where $U = H_{2m}^\perp$ is anisotropic. Therefore, $V = H_{2m+2} \oplus U$. □


Corollary 17.1 Every quadratic form $q$ in appropriate coordinates is of the form

$$
x_1 x_{i+1} + x_2 x_{i+2} + \cdots + x_i x_{2i} + \alpha(x_{2i+1}, x_{2i+2}, \ldots, x_r),
$$

where $r = \operatorname{rk}(q)$ and $\alpha(x) \ne 0$ for $x \ne 0$. □


17.2 Orthogonal Geometry of Nonsingular Forms


17.2.1 Isometries


Let $V$ be a finite-dimensional vector space equipped with a nonsingular quadratic
form $q \in S^2V^*$ with polarization $\tilde q : V \times V \to \Bbbk$. Recall¹³ that a linear operator
$f : V \to V$ is called orthogonal (or isometric) with respect to $q$ if

$$
\tilde q(fu, fw) = \tilde q(u, w) \quad \forall\, u, w \in V.
$$

By formula (17.3) on p. 422, this condition is equivalent to $q(fv) = q(v)$ for all
$v \in V$. We know from Sect. 16.2.3 on p. 396 that the isometries of $\tilde q$ form a group.
In the context of a quadratic form, this group is called the orthogonal group of the
nonsingular quadratic form $q$ and is denoted by

$$
\mathrm{O}_q(V) = \{ f \in \mathrm{GL}(V) \mid \forall\, v \in V \ \ q(fv) = q(v) \}.
$$


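Example 17.2 (Isometries of the Hyperbolic Plane) Let the linear operator
$f : H_2 \to H_2$ have matrix $F = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in a hyperbolic basis $e$, $e^*$ of $H_2$. Then $F$ is
orthogonal if and only if

$$
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 2ac & ad + bc \\ ad + bc & 2bd \end{pmatrix},
$$

i.e., when $ac = bd = 0$, $ad + bc = 1$. This system of equations has two families of
solutions:

$$
F_\lambda = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda^{-1} \end{pmatrix} \quad \text{and} \quad \widetilde F_\lambda = \begin{pmatrix} 0 & \lambda \\ \lambda^{-1} & 0 \end{pmatrix}, \quad \text{where } \lambda \text{ runs through } \Bbbk^*. \tag{17.7}
$$

¹³ See Sect. 16.2.3 on p. 396.

For $\Bbbk = \mathbb{R}$, the operator $F_\lambda$ with $\lambda > 0$ is called a hyperbolic rotation, because
the trajectory $F_\lambda v$ of every vector $v = (x, y)$ as $\lambda$ runs through the open ray $(0, \infty)$
is a hyperbola $xy = \mathrm{const}$. If we put $\lambda = e^t$ and pass to an orthonormal basis
$p = (e + e^*)/\sqrt2$, $q = (e - e^*)/\sqrt2$, the operator $F_\lambda$ acquires the matrix

$$
\begin{pmatrix} 1/\sqrt2 & 1/\sqrt2 \\ 1/\sqrt2 & -1/\sqrt2 \end{pmatrix} \begin{pmatrix} e^t & 0 \\ 0 & e^{-t} \end{pmatrix} \begin{pmatrix} 1/\sqrt2 & 1/\sqrt2 \\ 1/\sqrt2 & -1/\sqrt2 \end{pmatrix} = \begin{pmatrix} \cosh t & \sinh t \\ \sinh t & \cosh t \end{pmatrix},
$$

which looks quite similar to the matrix of Euclidean rotation. For $\lambda < 0$, the operator
$F_\lambda$ is a composition of a hyperbolic rotation with the central symmetry with respect
to the origin. In both cases, $F_\lambda$ are proper and preserve oriented areas, i.e., lie in
$\mathrm{SL}(\mathbb{R}^2)$. The operators $\widetilde F_\lambda$ are not proper and can be decomposed as hyperbolic
rotations followed by reflection in an axis of the hyperbola. They preserve Euclidean
area¹⁴ but reverse the orientation.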
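The computations of Example 17.2 can be checked mechanically. The following Python sketch is an illustration only, with hypothetical helper names: it tests the condition $F^t G F = G$ for both families (17.7) using exact rational arithmetic and verifies the passage to the basis $p$, $q$ numerically in floating point.

```python
from fractions import Fraction
import math

G = [[0, 1], [1, 0]]  # Gramian of the hyperbolic basis e, e*


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def is_isometry(F):
    """Check the orthogonality condition F^t G F = G."""
    return matmul(matmul(transpose(F), G), F) == G


def F_rot(lam):
    """First family in (17.7): diag(lam, 1/lam)."""
    return [[lam, 0], [0, 1 / lam]]


def F_refl(lam):
    """Second family in (17.7): antidiagonal matrix with entries lam, 1/lam."""
    return [[0, lam], [1 / lam, 0]]


def in_orthonormal_basis(t):
    """Matrix of F_rot(e^t) in the basis p, q (floating point)."""
    s = 1 / math.sqrt(2)
    P = [[s, s], [s, -s]]
    return matmul(matmul(P, F_rot(math.exp(t))), P)


print(is_isometry(F_rot(Fraction(5, 3))), is_isometry(F_refl(Fraction(-2))))
print(in_orthonormal_basis(0.7))
```

Exact fractions are used for the isometry test so that the comparison with $G$ is not affected by rounding.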


17.2.2 Reflections 


As in Euclidean geometry,¹⁵ every anisotropic vector $e \in V$ provides $V$ with an
orthogonal decomposition $V = \Bbbk \cdot e \oplus e^\perp$, where $e^\perp = \{ v \in V \mid \tilde q(e, v) = 0 \}$ is
the orthogonal hyperplane of $e$. Associated with this decomposition is the reflection
in the hyperplane $e^\perp$ defined by

$$
\sigma_e : V \to V, \quad v \mapsto \sigma_e(v) \stackrel{\text{def}}{=} v - 2\, \frac{\tilde q(e, v)}{\tilde q(e, e)} \cdot e. \tag{17.8}
$$

It acts identically on $e^\perp$ and takes $e$ to $-e$ (see Fig. 17.1). Therefore, $\sigma_e \in \mathrm{O}_q(V)$ and
$\sigma_e^2 = 1$.


Exercise 17.5 Verify the equality

$$
f \circ \sigma_e \circ f^{-1} = \sigma_{f(e)}
$$

for every isometry $f : V \to V$ and anisotropic vector $e \in V$.
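Formula (17.8) translates directly into code. Here is a minimal Python sketch, assuming that the form is given by a symmetric Gram matrix `Q` with integer or rational entries; the function names are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def bilinear(Q, u, v):
    """Polarization q~(u, v) = u Q v^t for coordinate vectors u, v."""
    n = len(Q)
    return sum(u[i] * Q[i][j] * v[j] for i in range(n) for j in range(n))


def reflect(Q, e, v):
    """Reflection (17.8) of v in the hyperplane orthogonal to an anisotropic e."""
    qe = bilinear(Q, e, e)
    if qe == 0:
        raise ValueError("the vector e must be anisotropic")
    c = Fraction(2 * bilinear(Q, e, v)) / qe
    return [vi - c * ei for vi, ei in zip(v, e)]


# Sample form x1^2 + x2^2 - x3^2, chosen for illustration.
Q = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
e = [1, 2, 1]
v = [3, -1, 2]
print(reflect(Q, e, v), reflect(Q, e, e))
```

Applying `reflect` twice with the same `e` returns the original vector, and the value of the form on `v` is unchanged, as expected from $\sigma_e \in \mathrm{O}_q(V)$ and $\sigma_e^2 = 1$.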


¹⁴ That is, the absolute value of the oriented area.
¹⁵ See Example 10.12 on p. 246.

Fig. 17.1 Reflection $\sigma_e$


Fig. 17.2 Reflections in a rhombus

Lemma 17.1 In a space with a nonsingular quadratic form $q$, for every pair of
anisotropic vectors $u$, $w$ such that $q(u) = q(w)$, there exists a reflection that sends $u$
either to $w$ or to $-w$.


Proof If $u$, $w$ are collinear, then $w = \pm u$, and the reflection $\sigma_u = \sigma_w$ sends $u$ to
$-u$, which is either $w$ or $-w$. Otherwise, let $u$, $w$ span the 2-dimensional plane $U$.
The two diagonals of the rhombus spanned by $u$, $w$ are $u + w$ and $u - w$. They are
perpendicular: $\tilde q(u + w, u - w) = q(u) - q(w) = 0$.
Since $U$ is not isotropic for $q$, at least one of the two diagonals, which form an
orthogonal basis in $U$, has to be anisotropic.¹⁶ Therefore, at least one of the two
reflections in the diagonals is well defined (see Fig. 17.2). The reflection $\sigma_{u-w}$ takes
$u$ to $w$; the reflection $\sigma_{u+w}$ takes $u$ to $-w$. □


¹⁶ Otherwise, the Gram matrix of the basis formed by the diagonals is zero.

Exercise 17.6 Verify the last two statements and show that if $V$ is anisotropic, then
there exists a reflection that sends $u$ exactly to $w$.


Theorem 17.2 Every isometry of an n-dimensional space with a nonsingular 
symmetric form is a composition of at most 2n reflections in hyperplanes. 


Proof Induction on $n$. For $n = 1$, the orthogonal group of the anisotropic line
consists of the identity $E$ and reflection $-E$, because the linear map $v \mapsto \lambda v$
satisfying $q(v) = q(\lambda v) = \lambda^2 q(v)$ has $\lambda = \pm 1$. Now let $f : V \to V$ be an
isometry of an $n$-dimensional space for $n > 1$. Choose any anisotropic vector $v$
in $V$ and write $\sigma : V \to V$ for the reflection that sends $f(v)$ either to $v$ or to $-v$;
it exists by Lemma 17.1, because $q(f(v)) = q(v)$. The
composition $\sigma f$ maps $v \mapsto \pm v$, and in both cases, sends the hyperplane $v^\perp$ to itself.
By the inductive hypothesis, the restriction of $\sigma f$ to $v^\perp$ is a composition of at most
$2(n - 1)$ reflections within $v^\perp$. All these reflections can be considered reflections
of $V$ in hyperplanes containing the vector $v$. Then the composition of these $2n - 2$
reflections of $V$ equals either $\sigma f$ or $\sigma_v \sigma f$. Therefore, $f$, which equals either $\sigma(\sigma f)$ or
$\sigma \sigma_v (\sigma_v \sigma f)$, is a composition of at most $(2n - 2) + 2 = 2n$ reflections. □


Exercise 17.7 Show that every isometry of an anisotropic space V is a composition 
of at most dim V reflections in hyperplanes. 


Lemma 17.2 (Witt's Lemma) Let $U$, $V$, $W$ be spaces with nonsingular symmetric
bilinear forms. If the orthogonal direct sum $U \oplus V$ is isometrically isomorphic to
the orthogonal direct sum $U \oplus W$, then $V$ and $W$ are isometrically isomorphic.


Proof Induction on $\dim U$. If $\dim U = 1$, then $U = \Bbbk \cdot u$, where $u$ is anisotropic.
Write

$$
f : \Bbbk \cdot u \oplus V \xrightarrow{\ \sim\ } \Bbbk \cdot u \oplus W
$$

for the isometric isomorphism in question. Let $\sigma$ be the reflection of the second
space such that $\sigma(f(u)) = \pm u$. Then the composition $\sigma f$ maps $u^\perp = V$ to¹⁷ $u^\perp =
W$, as required. For $\dim U > 1$, we choose some anisotropic vector $u \in U$ and apply
the inductive hypothesis to the orthogonal decompositions $U \oplus V = \Bbbk \cdot u \oplus (u^\perp \oplus V)$
and $U \oplus W = \Bbbk \cdot u \oplus (u^\perp \oplus W)$, where $u^\perp$ is taken within $U$, with $\Bbbk \cdot u$ in the role of $U$. We get an isometric
isomorphism between $u^\perp \oplus V$ and $u^\perp \oplus W$. Now apply the inductive hypothesis
with $u^\perp$ in the role of $U$ to get the required isometry between $V$ and $W$. □

¹⁷ The first orthogonal complement is taken in $U \oplus V$, the second in $U \oplus W$.


Corollary 17.2 Let $V$ be a space with a nonsingular symmetric bilinear form. Then
in the orthogonal decomposition $V = H_{2k} \oplus U$ from Theorem 17.1 on p. 426, the
hyperbolic space $H_{2k}$ and anisotropic space $U$ are determined by $V$ uniquely up to
isometry. In other words, given two such orthogonal decompositions

$$
V = H_{2k} \oplus U = H_{2m} \oplus W,
$$

the anisotropic spaces $U$, $W$ are isometrically isomorphic and the hyperbolic spaces
have equal dimensions $2k = 2m$.


Proof Let $m \ge k$, that is, $H_{2m} = H_{2k} \oplus H_{2(m-k)}$. Since there is an identical isometry

$$
\mathrm{Id}_V : H_{2k} \oplus U \xrightarrow{\ \sim\ } H_{2k} \oplus H_{2(m-k)} \oplus W,
$$

by Lemma 17.2 there exists an isometry $U \xrightarrow{\ \sim\ } H_{2(m-k)} \oplus W$. It forces $H_{2(m-k)} = 0$,
because there are no isotropic vectors in $U$. Thus, $k = m$, and $U$ is isometrically
isomorphic to $W$. □


Corollary 17.3 Let $V$ be a space with a nonsingular quadratic form $q$ and let
$U, W \subset V$ be some subspaces such that both restrictions $q|_U$, $q|_W$ are nonsingular.
Then an isometry $\varphi : U \xrightarrow{\ \sim\ } W$, if such exists, can be extended (in many ways) to an
isometric automorphism of $V$ coinciding with $\varphi$ on $U$.


Proof It is enough to show that there exists some isometric isomorphism

$$
\psi : U^\perp \xrightarrow{\ \sim\ } W^\perp.
$$

Then $\varphi \oplus \psi : U \oplus U^\perp \xrightarrow{\ \sim\ } W \oplus W^\perp$, $(u, u') \mapsto (\varphi(u), \psi(u'))$, is a required
automorphism of $V$. By our assumptions the maps

$$
\begin{aligned}
\eta &: U \oplus U^\perp \xrightarrow{\ \sim\ } V, & (u, u') &\mapsto u + u',\\
\zeta &: U \oplus W^\perp \xrightarrow{\ \sim\ } V, & (u, w') &\mapsto \varphi(u) + w',
\end{aligned}
$$

both are isometric isomorphisms. The composition

$$
\zeta^{-1} \eta : U \oplus U^\perp \xrightarrow{\ \sim\ } U \oplus W^\perp
$$

is an isometric isomorphism as well. By Witt's lemma, $U^\perp$ and $W^\perp$ are isometrically
isomorphic. □


Corollary 17.4 For each integer $k$ in the range $1 \le k \le \dim V / 2$, the orthogonal
group of every nonsingular quadratic form on $V$ acts transitively on $k$-dimensional
isotropic subspaces and on $2k$-dimensional hyperbolic subspaces in $V$.


Proof The claim about hyperbolic subspaces follows from Corollary 17.3. It implies
the claim about isotropic subspaces by Proposition 17.1. □


17.3 Quadratic Forms over Real and Simple Finite Fields


In this section we enumerate all quadratic forms over $\mathbb{R}$ and $\mathbb{F}_p$ up to isometry.

17.3.1 Quadratic Forms over $\mathbb{F}_p = \mathbb{Z}/(p)$


Let $p > 2$ be a prime number. Write $\varepsilon \in \mathbb{F}_p \smallsetminus \mathbb{F}_p^2$ for a nonsquare, fixed once and
for all. We know from Sect. 3.6.3 on p. 65 that the nonzero squares in $\mathbb{F}_p$ form a
subgroup of index 2 in the multiplicative group $\mathbb{F}_p^*$. This means that every nonzero
element of $\mathbb{F}_p$ is equivalent either to 1 or to $\varepsilon$ modulo multiplication by nonzero
squares. In particular, for every anisotropic vector $v$, there exists $\lambda \in \mathbb{F}_p^*$ such that
$q(\lambda v) = \lambda^2 q(v)$ equals either 1 or $\varepsilon$.


Lemma 17.3 For every $a, b \in \mathbb{F}_p^*$ and $c \in \mathbb{F}_p$, the equation $a x_1^2 + b x_2^2 = c$ is
solvable in $x_1, x_2 \in \mathbb{F}_p$.


Proof As $x_1$, $x_2$ run independently through $\mathbb{F}_p$, both quantities $a x_1^2$ and $c - b x_2^2$ take
$(p + 1)/2$ different values in $\mathbb{F}_p$. Since $|\mathbb{F}_p| = p$, these two sets of values have at
least one common element $a x_1^2 = c - b x_2^2$. □
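The counting argument in this proof is effectively an algorithm: tabulate the values of $a x_1^2$ and look for a value of $c - b x_2^2$ among them. A short Python sketch of this search follows; it is an illustration with names of our own choosing, and residues are represented by the integers $0, 1, \ldots, p - 1$.

```python
def solve_two_squares(a, b, c, p):
    """Find (x1, x2) with a*x1^2 + b*x2^2 = c in F_p, for a, b nonzero mod p."""
    left = {a * x * x % p: x for x in range(p)}  # the (p + 1)/2 values of a*x1^2
    for x2 in range(p):
        r = (c - b * x2 * x2) % p
        if r in left:  # a common element of the two value sets
            return left[r], x2
    return None


p = 7
x1, x2 = solve_two_squares(3, 5, 6, p)
print(x1, x2, (3 * x1 * x1 + 5 * x2 * x2) % p)
```

For an odd prime `p` and nonzero `a`, `b`, the lemma guarantees that the loop always finds a solution, so the final `return None` is never reached in that case.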


Proposition 17.2 Every quadratic form $q$ of rank $r$ over $\mathbb{F}_p$, $p > 2$, is equivalent to
$x_1^2 + \cdots + x_{r-1}^2 + x_r^2$ if $\det q_{\mathrm{red}} \in \mathbb{F}_p^{*2}$, and is equivalent to $x_1^2 + \cdots + x_{r-1}^2 + \varepsilon x_r^2$ if
$\det q_{\mathrm{red}} \notin \mathbb{F}_p^{*2}$, where $q_{\mathrm{red}}$ is the reduced form¹⁸ on $V / \ker q$.


Proof By Lagrange's theorem, Theorem 16.4 on p. 409,

$$
q(x) = \alpha_1 x_1^2 + \alpha_2 x_2^2 + \cdots + \alpha_r x_r^2
$$

in appropriate coordinates. It is enough to show that every linear combination of
two squares $\alpha_i x_i^2 + \alpha_j x_j^2$ can be rewritten either as $x_1^2 + x_2^2$ or as $x_1^2 + \varepsilon x_2^2$. This can
be done by means of Lemma 17.3. Write $U$ for the linear span of vectors $e_i$, $e_j$.
By Lemma 17.3, the map $q|_U : U \to \mathbb{F}_p$ is surjective. Hence, there exists $u \in U$
with $q(u) = 1$. The restriction of $q$ to the 1-dimensional orthogonal to $u$ in $U$ is
nonsingular, because both $q|_U$ and $q|_{\mathbb{F}_p \cdot u}$ are nonsingular. Therefore, the orthogonal
to $u$ in $U$ is anisotropic and contains a vector $w$ with $q(w)$ equal to either 1 or $\varepsilon$. □


Proposition 17.3 Up to isometry, there are exactly three anisotropic forms over $\mathbb{F}_p$
with $p > 2$, namely $x_1^2$, $\varepsilon x_1^2$, and either $x_1^2 + x_2^2$, for $p \equiv -1 \pmod 4$, or $x_1^2 + \varepsilon x_2^2$, for
$p \equiv 1 \pmod 4$.


Proof By Lemma 17.3, every form $a x_1^2 + b x_2^2 + c x_3^2 + \cdots$ of rank $\ge 3$ vanishes on a
nonzero vector $(\alpha_1, \alpha_2, 1, 0, \ldots)$ such that $a\alpha_1^2 + b\alpha_2^2 = -c$. Therefore, anisotropic
forms over $\mathbb{F}_p$ may exist only in dimensions 1 and 2. Both 1-dimensional nonzero
forms $x_1^2$, $\varepsilon x_1^2$ certainly are anisotropic. The nondegenerate 2-dimensional forms up
to isometry are exhausted by $q_1 = x_1^2 + x_2^2$ and $q_2 = x_1^2 + \varepsilon x_2^2$. We know from
Example 17.1 on p. 424 that $q_1$ has an isotropic vector if and only if $-\det q_1 = -1$
is a square in $\mathbb{F}_p$. By Sect. 3.6.3 on p. 65, this happens exactly for $p \equiv 1 \pmod 4$.
Since $\det q_2 = \varepsilon \det q_1$, the second form is anisotropic if and only if the first is not. □

¹⁸ See Exercise 17.2 on p. 423.
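For small primes, the statement of Proposition 17.3 can be confirmed by exhaustive search. The sketch below is a hedged illustration (the function names are ours, and the nonsquare $\varepsilon$ is taken to be the smallest one, found via Euler's criterion); it tests diagonal forms for isotropic vectors by brute force.

```python
from itertools import product


def nonsquare(p):
    """Smallest nonsquare modulo an odd prime p, detected by Euler's criterion."""
    return next(e for e in range(2, p) if pow(e, (p - 1) // 2, p) == p - 1)


def is_anisotropic(coeffs, p):
    """True if the diagonal form sum(c_i * x_i^2) vanishes over F_p only at x = 0."""
    return all(sum(c * x * x for c, x in zip(coeffs, v)) % p != 0
               for v in product(range(p), repeat=len(coeffs)) if any(v))


for p in (3, 5, 7, 11, 13):
    eps = nonsquare(p)
    print(p, p % 4, is_anisotropic([1, 1], p), is_anisotropic([1, eps], p))
```

In each printed row exactly one of the two binary forms is reported as anisotropic, and which one depends on the residue of `p` modulo 4, in line with the proposition.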


17.3.2 Real Quadratic Forms


By Theorem 16.4, every quadratic form $q$ with real coefficients can be written in
coordinates¹⁹ as

$$
q(x) = x_1^2 + x_2^2 + \cdots + x_p^2 - x_{p+1}^2 - x_{p+2}^2 - \cdots - x_{p+m}^2. \tag{17.9}
$$

The numbers $p$ and $m$ in (17.9) are called the positive and negative inertia indices of
$q$, the ordered pair $(p, m)$ is called the signature of $q$, and the difference $p - m$ is called
just the index of $q$. Let us verify that all these quantities do not depend on the choice
of coordinates in which $q$ looks like (17.9).

Of course, $p + m = \operatorname{rk} q$ does not depend on the basis. Replacing $V$ by
$V / \ker q$, we may assume that $q$ is not singular. Then $V$ splits into an orthogonal
direct sum of hyperbolic and anisotropic subspaces, both unique up to isometry.
Such a decomposition is read off easily from the presentation (17.9). Namely, each
pair of orthogonal coordinates with signature $(1, 1)$ produces a hyperbolic plane:
$x_1^2 - x_2^2 = (x_1 + x_2)(x_1 - x_2) = 2 y_1 y_2$, where $y_{1,2} = (x_1 \pm x_2)/\sqrt2$. The remaining
sum of either purely positive or purely negative squares is anisotropic. Therefore,
the dimension of the hyperbolic component of $q$ is $2 \min(p, m)$, and the dimension
of the anisotropic component is $|p - m|$. The sign of the difference $p - m$ shows
which of the two inequivalent anisotropic forms $\alpha$ appears in $q$: positive,²⁰ which
gives $\alpha(v) > 0$ for all $v \ne 0$, or negative, which gives $\alpha(v) < 0$ for all $v \ne 0$. Since
$p$, $m$ are uniquely recovered from $p + m$ and $p - m$, we get the following proposition.


Proposition 17.4 Two real quadratic forms are equivalent if and only if they have
equal ranks and indices. For each $n \in \mathbb{N}$, there are exactly two anisotropic forms of
rank $n$. They have indices $\pm n$. □


Exercise 17.8 Show that $p = \max \dim U$ over all $U \subset V$ such that $q|_U$ is
positive anisotropic, and $m = \max \dim U$ over all $U \subset V$ such that $q|_U$ is negative
anisotropic.


¹⁹ Choose any orthogonal basis $e_1, e_2, \ldots, e_n$ for $q$ and divide all $e_i$ such that $q(e_i) \ne 0$ by $\sqrt{|q(e_i)|}$.
²⁰ That is, Euclidean.

17.3.3 How to Find the Signature of a Real Form


Fix a basis in $V$ and write $V_k$ for the linear span of the first $k$ basis vectors
$e_1, e_2, \ldots, e_k$. Let $\Delta_k$ be the Gram determinant of the restriction $q|_{V_k}$, or equivalently,
the principal upper left $k \times k$ minor²¹ of the Gram matrix of $q$. It vanishes if and only
if $q|_{V_k}$ is singular. Otherwise, the sign of $\Delta_k$ equals $(-1)^{m_k}$, where $m_k$ is the negative
inertia index of $q|_{V_k}$. Reading the sequence $\Delta_1, \Delta_2, \ldots, \Delta_{\dim V}$ from left to right, we
can track signature changes under passing from $V_i$ to $V_{i+1}$ or detect the appearance
of isotropic vectors. In many cases, such an analysis allows us to determine the total
signature of $q$.

For example, let $\Delta_1 < 0$, $\Delta_2 = 0$, $\Delta_3 > 0$, $\Delta_4 = 0$, $\Delta_5 = 0$, $\Delta_6 < 0$.
Since $q|_{V_2}$ is singular, $V_2$ is the direct orthogonal sum of the negative anisotropic
line $\mathbb{R} e_1$ and an isotropic line. Since $V_3$ is not singular, the orthogonal to $e_1$ within
$V_3$ is nonsingular and contains an isotropic vector. Hence it must be a hyperbolic
plane, i.e., $V_3 = \mathbb{R} e_1 \oplus H_2$. Note that this forces $\Delta_3$ to be of opposite sign with
respect to $\Delta_1$, and this agrees with our data.


Exercise 17.9 Let $\Delta_{i-1} \ne 0$, $\Delta_i = 0$, and $\Delta_{i+1} \ne 0$ in a sequence of principal
upper left minors. Show that $\Delta_{i-1} \Delta_{i+1} < 0$ and $V_{i+1} = V_{i-1} \oplus H_2$.


Thus, $V_3$ has signature $(1, 2)$. The same arguments as above show that $V_3^\perp$ is
nondegenerate and contains an isotropic vector. Therefore, the signature of $V_3^\perp$ is
either $(2, 1)$ or $(1, 2)$. Since $\Delta_3$ and $\Delta_6$ are of opposite signs, we conclude that the
first case occurs. Thus, the total signature of $q$ is $(1, 2) + (2, 1) = (3, 3)$. An example
of such a form is provided by the Gramian

$$
\begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0
\end{pmatrix}.
$$

If all minors $\Delta_k$ are nonzero, then all restrictions $q|_{V_k}$ are nonsingular, and the
signs of $\Delta_{i+1}$ and $\Delta_i$ are different if and only if $m_{i+1} = m_i + 1$. Therefore, the
negative inertia index $m$ equals the number of sign inversions in the sequence
$1, \Delta_1, \Delta_2, \ldots, \Delta_{\dim V}$ in this case. This remark is known as Sylvester's law of
inertia.
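The sequence of principal upper left minors is easy to compute by machine. The following Python sketch is an illustration with hypothetical function names; determinants are found by Gaussian elimination over $\mathbb{Q}$. It lists the minors $\Delta_k$ of a Gram matrix and, when all of them are nonzero, counts the sign inversions in $1, \Delta_1, \Delta_2, \ldots$ to obtain the negative inertia index.

```python
from fractions import Fraction


def det(M):
    """Determinant of a square matrix by Gaussian elimination over Q."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    d = Fraction(1)
    for k in range(n):
        p = next((i for i in range(k, n) if A[i][k] != 0), None)
        if p is None:
            return Fraction(0)
        if p != k:
            A[k], A[p] = A[p], A[k]
            d = -d  # a row swap changes the sign
        d *= A[k][k]
        for i in range(k + 1, n):
            t = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= t * A[k][j]
    return d


def leading_minors(Q):
    """Principal upper left minors Delta_1, ..., Delta_n of Q."""
    return [det([row[:k] for row in Q[:k]]) for k in range(1, len(Q) + 1)]


def negative_index(Q):
    """Number of sign inversions in 1, Delta_1, ..., Delta_n (all minors nonzero)."""
    seq = [Fraction(1)] + leading_minors(Q)
    if any(x == 0 for x in seq):
        raise ValueError("some principal minor vanishes; the rule does not apply")
    return sum(1 for a, b in zip(seq, seq[1:]) if a * b < 0)


example = [[-1, 0, 0, 0, 0, 0],
           [0, 0, 1, 0, 0, 0],
           [0, 1, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 1],
           [0, 0, 0, 0, 1, 0],
           [0, 0, 0, 1, 0, 0]]
print(leading_minors(example))
print(negative_index([[2, 1], [1, -3]]))
```

For the $6 \times 6$ Gramian above, the minors reproduce the sign pattern $\Delta_1 < 0$, $\Delta_2 = 0$, $\Delta_3 > 0$, $\Delta_4 = \Delta_5 = 0$, $\Delta_6 < 0$ discussed in the text, so `negative_index` refuses it and the case analysis given there is needed instead.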


*IThat is, the determinant of the submatrix formed by the topmost k rows and leftmost k columns. 
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17.4 Projective Quadrics 


17.4.1 Geometric Properties of Projective Quadrics 


A projective hypersurface? Q = Z(q) = {v € P(V) | q(v) = 0} formed by 
isotropic vectors of a nonzero quadratic form q € S*V* is called a projective 
quadric. Quadrics in P, are called conics, and quadrics in P3 are called quadratic 
surfaces. Since proportional forms define the same quadric, projective quadrics 
correspond to the points of the projective space of quadrics~* P(S*V*). Two 
projective quadrics in P(V) are called projectively equivalent” if there exists a linear 
projective automorphism P(V) + P(V) sending one of them to the other. 

As above, we write 9 : V x V — ki for the polarization of a quadratic form 
q € S°V*, and q : V > V* for the correlation map,”° which takes a vector v € V to 
the covector u + q(u, v). The projectivization of the kernel of the correlation map, 


Sing Q # P(ker 9) = P{w € V | Vue V G(u,w) = 0}, (17.10) 


is called the singular locus²⁷ of the quadric $Q = Z(q)$. A quadric $Q$ is called
smooth²⁸ if $\operatorname{Sing} Q = \varnothing$. Otherwise, $Q$ is called singular or degenerate. Note that
$\operatorname{Sing} Q \subset Q$ is a projective subspace lying on $Q$. In particular, every singular quadric
is nonempty. All points $a \in \operatorname{Sing} Q$ are called singular points of $Q$; all the other
points $a \in Q \smallsetminus \operatorname{Sing} Q$ are called smooth.


Example 17.3 (Quadrics on the Projective Line, the Geometric Version of Example 17.1)
In Example 17.1 on p. 424, we have seen that on $\mathbb{P}_1$ there exists a unique,
up to isomorphism, singular quadric. It may be given by the equation $x_0^2 = 0$ and is
called a double point. Smooth quadrics $Q = Z(q)$ are of two types. If $-\det q$ is a
square, then $Q$ may be given by the equation $x_0x_1 = 0$, which constitutes a pair of
distinct points. If $-\det q$ is not a square, then $Q = \varnothing$. The latter never happens over
an algebraically closed field.


Corollary 17.5 The positional relationships between a line $\ell$ and a quadric $Q$ in
projective space are exhausted by the following alternative variants: either $\ell \subset Q$,
or $\ell \cap Q$ is a double point, or $\ell \cap Q$ is a pair of distinct points, or $\ell \cap Q = \varnothing$. The
last case is impossible over an algebraically closed field. □


²² See Sect. 11.3 on p. 263.

²³ See Sect. 11.3.3 on p. 265.

²⁴ Or isomorphic.

²⁵ See Sect. 17.1 on p. 421.

²⁶ See Sect. 16.1 on p. 387.

²⁷ Or vertex subspace.

²⁸ Other names: nonsingular and nondegenerate.
Definition 17.1 (Tangent Lines and Tangent Space) Let $Q$ be a projective quadric
and $a \in Q$ a point. A line $\ell \ni a$ is called tangent to $Q$ at $a$ if either $\ell \subset Q$ or $\ell \cap Q$
is a double point at $a$. The union of all lines tangent to a quadric $Q$ at a point $a \in Q$
is called the tangent space of $Q$ at $a$ and is denoted by $T_aQ$.


Lemma 17.4 A line $(ab)$ is tangent to a quadric $Q = Z(q)$ at a point $a \in Q$ if and
only if $\tilde q(a, b) = 0$.


Proof Write $U \subset V$ for the linear span of the vectors $a, b \in V$. The restriction
$q|_U$ is either zero or singular if and only if its Gramian

$$G = \begin{pmatrix} 0 & \tilde q(a,b) \\ \tilde q(a,b) & \tilde q(b,b) \end{pmatrix}$$

has $\det G = -\tilde q(a,b)^2 = 0$. □


Corollary 17.6 If a point $a \in Q = Z(q) \subset \mathbb{P}(V)$ is smooth, then

$$T_aQ = \{x \in \mathbb{P}_n \mid \tilde q(a, x) = 0\}$$

is a hyperplane. If $a \in \operatorname{Sing} Q$, then $T_aQ = \mathbb{P}(V)$, i.e., every line passing through $a$
either lies on $Q$ or does not intersect $Q$ anywhere outside $a$.


Proof The linear equation $\tilde q(a,x) = 0$ either defines a hyperplane or is satisfied
identically in $x$. The latter means that $a \in \ker \hat q$. □


Remark 17.1 By Exercise 17.1 on p. 422, the linear equation $\tilde q(a,x) = 0$, which
determines the tangent space of the quadric $Z(q)$ at the point $a \in Z(q)$, can be
written as $\sum_i \frac{\partial q}{\partial x_i}(a) \cdot x_i = 0$. In particular, a point $a$ is singular if and only if all
partial derivatives $\partial q/\partial x_i$ vanish at $a$.
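
In coordinates, the tangent-space equation is just the row vector $a^tG$, where $G$ is the Gramian of $q$. The following Python sketch is an illustration; the function names `polar` and `tangent_hyperplane` are assumptions made here. It computes the coefficients of the linear form $x \mapsto \tilde q(a,x)$; a zero row signals a singular point.

```python
from fractions import Fraction


def polar(gram, u, w):
    """Polarization q~(u, w) = u^t G w for a symmetric Gramian G."""
    n = len(u)
    return sum(Fraction(gram[i][j]) * u[i] * w[j]
               for i in range(n) for j in range(n))


def tangent_hyperplane(gram, a):
    """Coefficients of the linear form x -> q~(a, x), i.e. the row a^t G.

    These are half of the partial derivatives of q at a.
    """
    n = len(a)
    return [sum(Fraction(gram[i][j]) * a[i] for i in range(n))
            for j in range(n)]


# Smooth conic x0^2 + x1^2 - x2^2 = 0 and the point a = (3 : 4 : 5) on it.
conic = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
a = [3, 4, 5]
coeffs = tangent_hyperplane(conic, a)  # tangent line 3*x0 + 4*x1 - 5*x2 = 0

# Singular conic x0^2 - x1^2 = 0: the point (0 : 0 : 1) spans the kernel.
pair_of_lines = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
vertex_coeffs = tangent_hyperplane(pair_of_lines, [0, 0, 1])  # all zeros
```

Here `coeffs` equals half the gradient $(6, 8, -10)$ of $q$ at $a$, in agreement with the partial-derivative form of the equation.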


Corollary 17.7 (Apparent Contour) The apparent contour of a quadric $Q = Z(q)$
viewed from a point²⁹ $b \notin Q$ is cut out of $Q$ by the hyperplane

$$b^\perp = \operatorname{Ann} \hat q(b) = \{x \mid \tilde q(b,x) = 0\}.$$

Proof If $b \notin Q$, then $\tilde q(b, b) = q(b) \ne 0$. Therefore, the linear equation $\tilde q(b, x) = 0$
is nontrivial and defines a hyperplane. □


Proposition 17.5 If a quadric $Q \subset \mathbb{P}_n$ has a smooth point $a \in Q$, then $Q$ is not
contained in a hyperplane.


Proof For $n = 1$, this follows from Example 17.3. Consider $n \ge 2$. If $Q$ lies within
a hyperplane $H$, then every line $\ell \not\subset H$ passing through $a$ intersects $Q$ only in $a$ and
therefore is tangent to $Q$ at $a$. Hence, $\mathbb{P}_n = H \cup T_aQ$. This contradicts Exercise 17.10
below. □


²⁹ That is, the locus of all points $a \in Q$ such that the line $(ba)$ is tangent to $Q$ at $a$.

Exercise 17.10 Show that projective space over a field of characteristic $\ne 2$ is not
a union of two hyperplanes.


Theorem 17.3 Let $Q \subset \mathbb{P}_n$ be an arbitrary quadric and $L \subset \mathbb{P}_n$ a projective
subspace complementary³⁰ to $\operatorname{Sing} Q$. Then $Q' = L \cap Q$ is a smooth quadric in
$L$, and $Q$ is the linear join³¹ of $Q'$ and $\operatorname{Sing} Q$.


Proof The smoothness of $Q'$ follows from Proposition 16.6 on p. 409. Every line
intersecting $\operatorname{Sing} Q$ either belongs to $\operatorname{Sing} Q$ or can be written as $(ab)$ for some
$a \in \operatorname{Sing} Q$ and $b \in L$. By Corollary 17.6, the line $(ab)$ either lies on $Q$, in which
case $b \in Q'$, or does not intersect $Q$ anywhere except at $a$. □


Example 17.4 (Singular Conics) Let $C \subset \mathbb{P}_2$ be a singular conic. If $\operatorname{Sing} C = s$
is a point, then by Theorem 17.3, $C$ is formed by lines joining $s$ with the points of
a nonsingular quadric $Q' \subset \ell$ living within some line $\ell \not\ni s$. If $Q' \ne \varnothing$, then $C$
is a pair of crossing lines. If $Q' = \varnothing$, then $C = s$, and this never happens over
an algebraically closed field. If $\operatorname{Sing} C$ is a line, then $C$ is exhausted by this line,
because the nonsingular quadric within the complementary point $\mathbb{P}_0$ is empty. Such
a $C$ is called a double line, because its equation is the perfect square of a linear form.


Exercise 17.11 Show that every rank-1 quadratic form is the perfect square of a 
linear form and give a geometric classification of singular quadratic surfaces in P3 
similar to Example 17.4. 


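Example 17.5 (Veronese Conic) Consider $\mathbb{P}_1 = \mathbb{P}(U)$ and write $S^2\mathbb{P}_1$ for the set
of unordered pairs of points $\{a,b\} \subset \mathbb{P}_1$, where $a = b$ is allowed as well. Over an
algebraically closed field, there is a canonical bijection

$$S^2\mathbb{P}_1 \xrightarrow{\sim} \mathbb{P}_2 = \mathbb{P}(S^2U^*),$$

which takes $\{a, b\} \subset \mathbb{P}(U)$ to the quadratic form $q_{a,b}(x) = \det(x, a) \cdot \det(x, b)$, the
equation for $Q = \{a,b\}$ in some homogeneous coordinates $x = (x_0 : x_1)$ on $\mathbb{P}_1$,
where

$$\det(x, p) \overset{\text{def}}{=} \det \begin{pmatrix} x_0 & p_0 \\ x_1 & p_1 \end{pmatrix} = p_1x_0 - p_0x_1 \quad \text{for } p = (p_0 : p_1) \in \mathbb{P}_1.$$

In the basis $x_0^2$, $2x_0x_1$, $x_1^2$ for $S^2U^*$, where the quadratic form $\vartheta_0x_0^2 + 2\vartheta_1x_0x_1 + \vartheta_2x_1^2$
has coordinates $(\vartheta_0 : \vartheta_1 : \vartheta_2)$, the form $q_{a,b}(x)$ corresponding to the points $a =
(\alpha_0 : \alpha_1)$ and $b = (\beta_0 : \beta_1)$ has coordinates $(-2\alpha_1\beta_1 : \alpha_0\beta_1 + \alpha_1\beta_0 : -2\alpha_0\beta_0)$.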


³⁰ See Sect. 11.4 on p. 268.
³¹ That is, the union of all lines $(ab)$, where $a \in Q'$, $b \in \operatorname{Sing} Q$.
Pairs of coinciding points $\{a, a\}$ are mapped bijectively onto the Veronese conic
$C \subset S^2U^*$, which consists of rank-one quadratics³² and is described by the equation

$$\det \begin{pmatrix} \vartheta_0 & \vartheta_1 \\ \vartheta_1 & \vartheta_2 \end{pmatrix} = \vartheta_0\vartheta_2 - \vartheta_1^2 = 0. \tag{17.11}$$

For fixed $a$ and varying $b$, the quadratic forms $q_{a,b}$ form a line in $S^2U^*$ meeting $C$
in exactly one point $q_{a,a}$. Identifying the points of $\mathbb{P}_2$ with pairs $\{a,b\} \subset \mathbb{P}_1$ and
Veronese's conic with double points $\{a, a\}$, we can say that the line tangent to $C$ at
$\{a, a\}$ consists of all $\{a,b\}$, $b \in \mathbb{P}_1$.
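
The coordinates of $q_{a,b}$ and the equation of the Veronese conic can be verified on integer examples. The Python sketch below is an illustration only; the names `pair_to_form`, `evaluate`, and `veronese` are assumptions introduced here. It uses the coordinates $(-2\alpha_1\beta_1 : \alpha_0\beta_1+\alpha_1\beta_0 : -2\alpha_0\beta_0)$ in the basis $x_0^2, 2x_0x_1, x_1^2$.

```python
def pair_to_form(a, b):
    """Coordinates (t0 : t1 : t2) of q_{a,b} in the basis x0^2, 2*x0*x1, x1^2."""
    (a0, a1), (b0, b1) = a, b
    return (-2 * a1 * b1, a0 * b1 + a1 * b0, -2 * a0 * b0)


def evaluate(theta, x):
    """Value of t0*x0^2 + 2*t1*x0*x1 + t2*x1^2 at x = (x0, x1)."""
    t0, t1, t2 = theta
    x0, x1 = x
    return t0 * x0 * x0 + 2 * t1 * x0 * x1 + t2 * x1 * x1


def veronese(theta):
    """Left-hand side t0*t2 - t1^2 of the Veronese conic equation."""
    return theta[0] * theta[2] - theta[1] ** 2


a, b = (1, 2), (3, 4)
theta = pair_to_form(a, b)         # the form vanishes exactly at a and b
double = pair_to_form(a, a)        # a double point lies on the conic
```

For distinct points the value of `veronese` is $-(\alpha_0\beta_1-\alpha_1\beta_0)^2 \ne 0$, so only the double points $\{a,a\}$ land on $C$.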


Exercise 17.12 Show geometrically that each nontrivial involution³³ of $\mathbb{P}_1$ has
exactly two distinct fixed points and construct a bijection between involutions and
points of $S^2\mathbb{P}_1 \smallsetminus C$.


Proposition 17.6 Let $C \subset \mathbb{P}_2$ be a smooth conic and $D \subset \mathbb{P}_2$ a curve of degree $d$.
Then either $C \subset D$ or $C$ intersects $D$ in at most $2d$ points.


Proof Let $D = Z(f)$ and $C = Z(q)$. If $C = \varnothing$, there is nothing to prove. If there is
some $a \in C$, choose any line $\ell \not\ni a$ and consider the projection $\pi_a : C \to \ell$ from $a$
onto $\ell$. It is bijective, because each line $(ab) \ne T_aC$, $b \in \ell$, intersects $C$ in exactly
one point $c \ne a$, namely, in

$$c(b) = q(b) \cdot a - 2\tilde q(a,b) \cdot b.$$


Exercise 17.13 Verify this and check that for $b = T_aC \cap \ell$, we get $c = a$.


The inverse map $\varphi : \ell \xrightarrow{\sim} C$, $b \mapsto c(b) = (ab) \cap C$, provides $C$ with a homogeneous
parametrization quadratic in $b$. Substituting $x = c(b)$ in the equation for $D$, we get
$f\bigl(q(b) \cdot a - 2\tilde q(a, b) \cdot b\bigr) = 0$, which is either the trivial identity $0 = 0$ or a nontrivial
homogeneous equation of degree $2d$ in $b$. In the first case, $C \subset D$. In the second
case, there are at most $2d$ roots. □
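
The parametrization can be tested numerically. Expanding $q(\lambda a + \mu b)$ with $q(a) = 0$ and $q(v) = \tilde q(v,v)$ gives $\mu\,(2\lambda\,\tilde q(a,b) + \mu\,q(b)) = 0$, so the second intersection point of $(ab)$ with $C$ is proportional to $q(b)\cdot a - 2\tilde q(a,b)\cdot b$. The Python sketch below is an illustration; the names `polar` and `second_point` and the sample conic are assumptions made here.

```python
def polar(gram, u, w):
    """Polarization q~(u, w) = u^t G w; q(v) is polar(gram, v, v)."""
    n = len(u)
    return sum(gram[i][j] * u[i] * w[j] for i in range(n) for j in range(n))


def second_point(gram, a, b):
    """Second intersection point of the line (ab) with the conic, for a on it."""
    qb = polar(gram, b, b)
    qab = polar(gram, a, b)
    return [qb * ai - 2 * qab * bi for ai, bi in zip(a, b)]


# Sample conic x0^2 + x1^2 - x2^2 = 0 with the base point a = (1 : 0 : 1).
conic = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
a = [1, 0, 1]

c = second_point(conic, a, [1, 2, 0])        # a secant through a
tangent_case = second_point(conic, a, [0, 1, 0])  # b on the tangent line at a
```

In the tangent case $\tilde q(a,b) = 0$, and the formula returns a multiple of $a$, as Exercise 17.13 predicts.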


Exercise 17.14 Let $\Bbbk$ be a field with $\operatorname{char} \Bbbk \ne 2$. Show that:


(a) Every five points in $\mathbb{P}_2$ lie on a conic.
(b) If no four of the points are collinear, then such a conic is unique.
(c) If no three of the points are collinear, then the conic is smooth.


³² That is, perfect squares; see Exercise 17.11.


³³ That is, a nonidentical linear projective automorphism $\sigma : \mathbb{P}_1 \xrightarrow{\sim} \mathbb{P}_1$ such that $\sigma^2 = \mathrm{Id}_{\mathbb{P}_1}$.
Example 17.6 (Segre Quadric) Fix a 2-dimensional vector space $U$, put $W =
\operatorname{End}(U)$, and consider $\mathbb{P}_3 = \mathbb{P}(W)$. Write

$$Q_S \overset{\text{def}}{=} \{F \in \operatorname{End}(U) \mid \det F = 0\} = \left\{ \begin{pmatrix} x_0 & x_1 \\ x_2 & x_3 \end{pmatrix} \,\middle|\, x_0x_3 - x_1x_2 = 0 \right\} \subset \mathbb{P}_3 \tag{17.12}$$

for the quadric formed by endomorphisms of rank 1 considered up to proportionality.
This quadric is called Segre's quadric. Every rank-1 operator $F : U \to U$ has
a 1-dimensional image spanned by some vector $v \in U$, uniquely determined by $F$
up to proportionality. Then the value of $F$ on every $u \in U$ equals $F(u) = \xi(u) \cdot v$,
where $\xi \in U^*$ is a linear form such that $\operatorname{Ann} \xi = \ker F$. Therefore, $\xi$ is also uniquely
determined by $F$ up to proportionality. Conversely, for every nonzero $v \in U$,
$\xi \in U^*$, the operator

$$\xi \otimes v : U \to U, \quad u \mapsto \xi(u) \cdot v,$$

has rank 1. Therefore, there is a well-defined injective map

$$s : \mathbb{P}(U^*) \times \mathbb{P}(U) \hookrightarrow \mathbb{P}\operatorname{End}(U), \quad (\xi, v) \mapsto \xi \otimes v, \tag{17.13}$$

whose image is precisely the Segre quadric (17.12). The map (17.13) is called
Segre's embedding.

Every $2 \times 2$ matrix of rank 1 has proportional rows and proportional columns. If
we fix either of the ratios

$$([\text{row } 1] : [\text{row } 2]) = (t_0 : t_1), \qquad ([\text{column } 1] : [\text{column } 2]) = (\xi_0 : \xi_1),$$

then we get a 2-dimensional vector subspace in $W$ formed by the rank-1 matrices
with the fixed ratio. After projectivization, it is a line on the Segre quadric. We
conclude that Segre's quadric is ruled by two families of lines, the images of
"coordinate lines" $\mathbb{P}_1^\times \times v$ and $\xi \times \mathbb{P}_1$ on $\mathbb{P}_1^\times \times \mathbb{P}_1$ under the Segre embedding (17.13).
Indeed, the operator $\xi \otimes v$ coming from $\xi = (\xi_0 : \xi_1) \in U^*$ and $v = (t_0 : t_1) \in U$
has matrix

$$\begin{pmatrix} t_0 \\ t_1 \end{pmatrix} \cdot \begin{pmatrix} \xi_0 & \xi_1 \end{pmatrix} = \begin{pmatrix} \xi_0t_0 & \xi_1t_0 \\ \xi_0t_1 & \xi_1t_1 \end{pmatrix} \tag{17.14}$$

with prescribed ratios between rows and columns.
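
The matrix (17.14) is an outer product, which makes the Segre embedding easy to experiment with. The following Python sketch is illustrative; the function names `segre` and `det2x2` are assumptions made here. It builds the matrix of $\xi \otimes v$ and confirms that it satisfies $x_0x_3 - x_1x_2 = 0$.

```python
def segre(xi, t):
    """Matrix of the operator xi (x) v for xi = (xi0, xi1), v = (t0, t1)."""
    return [[xi[0] * t[0], xi[1] * t[0]],
            [xi[0] * t[1], xi[1] * t[1]]]


def det2x2(m):
    """Determinant x0*x3 - x1*x2 of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


m = segre((2, 3), (5, 7))
# Rows are in ratio (t0 : t1) = (5 : 7), columns in ratio (xi0 : xi1) = (2 : 3).
```

Fixing `t` and varying `xi` (or vice versa) sweeps out one line from each of the two ruling families.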

Since the Segre embedding establishes a bijection between $\mathbb{P}_1^\times \times \mathbb{P}_1$ and $Q_S$, the
incidence relations among the coordinate lines in $\mathbb{P}_1^\times \times \mathbb{P}_1$ are the same as among
their images in $Q_S$. This means that within each ruling family, all the lines are
mutually skew,³⁴ every two lines from different ruling families are intersecting, and
each point on the Segre quadric is an intersection point of exactly two lines from
different families. Moreover, all lines $\ell \subset Q_S$ are exhausted by these two ruling
families, because every line $\ell \subset Q_S$ belongs to $Q_S \cap T_aQ_S$ for some $a \in Q_S$, and the
conic $T_aQ_S \cap Q_S$ is exhausted by two ruling lines crossing at $a$.


³⁴ That is, nonintersecting (or complementary).


Exercise 17.15 Show that (a) every nine points, (b) every three lines in P3 lie on 
some quadric. Prove that the quadric passing through three mutually skew lines is 
unique and is isomorphic to the Segre quadric. 


17.4.2 Smooth Quadrics


For every nonsingular quadratic form $q \in S^2V^*$, the orthogonal operators $F \in
\mathrm{O}_q(V)$ produce linear projective automorphisms $\mathbb{P}(V) \xrightarrow{\sim} \mathbb{P}(V)$ sending the quadric
$Q = Z(q)$ to itself. They are called the automorphisms of the smooth projective
quadric $Q$. By Corollary 17.4, the automorphisms of $Q$ act transitively on the points
of $Q$ and on the projective subspaces $L \subset Q$ of any given dimension. In particular,
$\max \dim L$ taken over all subspaces $L \subset Q$ passing through a given point $a \in Q$ is the
same for all $a \in Q$. This maximal dimension is called the planarity of $Q$ and denoted
by $m(Q)$. Smooth quadrics of different planarities are clearly not isomorphic.

By Corollary 17.2 on p. 430, every smooth quadric $Q = Z(q) \subset \mathbb{P}_n = \mathbb{P}(V)$ can
be given in appropriate coordinates by the equation

$$x_0x_1 + x_2x_3 + \cdots + x_{2m}x_{2m+1} + \alpha(x_{2m+2}, \ldots, x_n) = 0, \tag{17.15}$$

where $\alpha(x) \ne 0$ for all $x \ne 0$, and $m + 1 \in \mathbb{N}$ is equal to the dimension of the
maximal isotropic subspace of $q$ in $V$. We conclude that the quadric (17.15) has
planarity $m$. Therefore, for different $m$, the equations (17.15) constitute projectively
inequivalent quadrics. Note that the planarity $m = -1$ is allowed as well and means
that $Z(q) = Z(\alpha) = \varnothing$.

Example 17.7 (Smooth Quadrics over an Algebraically Closed Field) Over an
algebraically closed field, the only anisotropic form is the 1-dimensional form $x^2$.
Hence, for each $n \in \mathbb{N}$, there exists exactly one smooth $n$-dimensional quadric
$Q_n \subset \mathbb{P}_{n+1}$ up to isomorphism. For even $n = 2m$, it can be defined by the equation

$$x_0x_1 + x_2x_3 + \cdots + x_{2m}x_{2m+1} = 0; \tag{17.16}$$

for odd $n = 2m + 1$ it can be defined by the equation

$$x_0x_1 + x_2x_3 + \cdots + x_{2m}x_{2m+1} = x_{2m+2}^2. \tag{17.17}$$

The planarity of both quadrics (17.16), (17.17) is $m$, that is, some $m$-dimensional
projective subspace $L \subset Q_n$ can be drawn through each point $a \in Q_n$, and there are
no $(m + 1)$-dimensional subspaces lying on $Q_n$.


Example 17.8 (Smooth Real Quadrics) For $\Bbbk = \mathbb{R}$, in each dimension $k \in \mathbb{N}$ there
exists a unique, up to a sign, anisotropic form

$$\alpha_k(x_1, x_2, \ldots, x_k) = x_1^2 + x_2^2 + \cdots + x_k^2.$$

Hence, every smooth $n$-dimensional quadric in $\mathbb{P}_{n+1} = \mathbb{P}(\mathbb{R}^{n+2})$ can be represented
by the equation

$$x_0x_1 + x_2x_3 + \cdots + x_{2m}x_{2m+1} = x_{2m+2}^2 + x_{2m+3}^2 + \cdots + x_{n+1}^2, \tag{17.18}$$


where $-1 \le m \le n/2$. We denote this quadric by $Q_{n,m}$ and call it an $n$-dimensional
$m$-planar smooth real quadric. Since each of the $m + 1$ hyperbolic products in (17.18)
has signature $(1,1)$ and the remaining $n - 2m$ squares share one sign, the quadratic
form (17.18) has, up to sign, signature

$$(n + 1 - m,\ m + 1)$$

and index $n - 2m$. Any one of these quantities characterizes a smooth $n$-dimensional
real quadric uniquely up to isomorphism. For $m \ge 0$, all quadrics
$Q_{n,m}$ are projectively inequivalent. All $(-1)$-planar quadrics $Q_{n,-1}$ with anisotropic
equations $x_0^2 + x_1^2 + \cdots + x_{n+1}^2 = 0$ are empty.

In orthogonal coordinates, the quadric $Q_{n,m}$ is given by the equation

$$t_0^2 + t_1^2 + \cdots + t_m^2 = t_{m+1}^2 + t_{m+2}^2 + \cdots + t_{n+1}^2.$$

The hyperbolic coordinates $x_\nu$ are expressed in terms of orthogonal coordinates $t_\nu$ as
$x_{2i} = t_i + t_{m+1+i}$, $x_{2i+1} = t_i - t_{m+1+i}$ for $0 \le i \le m$ and $x_j = t_j$ for $2m+2 \le j \le n+1$.

The 0-planar quadric $t_0^2 = t_1^2 + t_2^2 + \cdots + t_{n+1}^2$ is called elliptic. All quadrics
of higher planarities are traditionally called hyperbolic, although this is not quite
correct, because the equation of such a quadric is a purely hyperbolic form only for
$n = 2m$.
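
For a nonsingular real form written in an orthogonal basis, the planarity can be read off from the signs of the diagonal coefficients: a form of signature $(p, m)$ has maximal isotropic subspaces of dimension $\min(p, m)$, hence planarity $\min(p, m) - 1$. The Python sketch below is only an illustration of this bookkeeping; the names `signature` and `planarity` are assumptions introduced here.

```python
def signature(diagonal):
    """(number of positive, number of negative) coefficients of a diagonal form."""
    if any(c == 0 for c in diagonal):
        raise ValueError("the form must be nonsingular")
    positive = sum(1 for c in diagonal if c > 0)
    return positive, len(diagonal) - positive


def planarity(diagonal):
    """Planarity of the smooth real quadric sum(c_i * t_i^2) = 0."""
    p, m = signature(diagonal)
    return min(p, m) - 1


segre_like = planarity([1, 1, -1, -1])   # signature (2, 2): lines lie on it
sphere_like = planarity([1, -1, -1, -1]) # elliptic quadric: points only
empty = planarity([1, 1, 1])             # anisotropic form: empty quadric
```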


Proposition 17.7 For every smooth quadric $Q$ and hyperplane $\Pi$, the intersection
$\Pi \cap Q \subset \Pi$ either is a smooth quadric in $\Pi$ or has exactly one singular point $p$. In
the latter case, $\Pi = T_pQ$, and the quadric $\Pi \cap Q$ is a cone with vertex $p$ over the
smooth quadric $Q' = \Pi' \cap Q$ cut out of $Q$ by any codimension-1 subspace $\Pi' \subset T_pQ$
complementary to $p$. Furthermore, for the planarity, we have $m(Q') = m(Q) - 1$.


Proof Let $Q = Z(q) \subset \mathbb{P}(V)$ and $\Pi = \mathbb{P}(W)$. Then

$$\dim \ker \left( q|_W \right) = \dim \left( W \cap \hat q^{-1}(\operatorname{Ann} W) \right) \le \dim \hat q^{-1}(\operatorname{Ann} W) = \dim \operatorname{Ann} W = \dim V - \dim W = 1.$$

Let $\dim \ker (q|_W) = 1$ and suppose that $p$ spans the kernel. Since $p \in Q \cap \Pi$
and $\operatorname{Ann}(\hat q(p)) = W$, we conclude that $T_pQ = \Pi$. Conversely, if $\Pi = T_pQ =
\mathbb{P}(\operatorname{Ann} \hat q(p))$ for some point $p \in Q$, then $p \in \operatorname{Ann} \hat q(p)$ lies in the kernel of the
restriction of $q$ to $\operatorname{Ann} \hat q(p)$, and it spans this kernel, because the kernel is 1-dimensional. This
proves the first two statements of the proposition. By Theorem 17.3 on p. 437, the
restriction of $q$ to any codimension-1 subspace $\Pi' = \mathbb{P}(U) \subset T_pQ$ complementary
to $p$ is nonsingular. Therefore, $V = U \oplus U^\perp$, where $\dim U^\perp = 2$ and the restriction
$q|_{U^\perp}$ is nonsingular too. Since $p \in U^\perp$ is isotropic for $q|_{U^\perp}$, the subspace $U^\perp \subset V$
is a hyperbolic plane. Hence, the dimension of the hyperbolic component of the
form $q|_U$ is two less than the dimension of the hyperbolic component of $q$. Since
$Q' = Z(q|_U)$, the last statement of the proposition holds. □


Corollary 17.8 Let $Q \subset \mathbb{P}_n$ be a smooth quadric of planarity $m$ and $p \in Q$ any
point. Write $Q' = Q \cap \mathbb{P}_{n-2}$ for the quadric of planarity $(m - 1)$ cut out of $Q$ by
any codimension-1 subspace $\mathbb{P}_{n-2} \subset T_pQ = \mathbb{P}_{n-1}$ complementary to $p$. Then the
$m$-dimensional subspaces $L \subset Q$ passing through $p$ are in bijection with the set of all
$(m - 1)$-dimensional subspaces $L' \subset Q'$. □


Example 17.9 (Subspaces on Smooth Quadrics over an Algebraically Closed Field)
If $\Bbbk$ is algebraically closed, then the smooth quadrics $Q_0 \subset \mathbb{P}_1$ and $Q_1 \subset \mathbb{P}_2$ both
are 0-planar. The next two quadrics $Q_2 \subset \mathbb{P}_3$ and $Q_3 \subset \mathbb{P}_4$ are 1-planar. For every
$p \in Q_2$, there are exactly two lines $\ell \subset Q_2$ passing through $p$. They join $p$ with two
points of some smooth quadric $Q_0 \subset \mathbb{P}_1 \subset T_pQ_2 \smallsetminus \{p\}$. Note that this agrees with
Example 17.6. The lines $\ell \subset Q_3$ passing through a given point $p \in Q_3$ join $p$ with
the points of some smooth conic $Q_1 \subset \mathbb{P}_2 \subset T_pQ_3 \smallsetminus \{p\}$ and form a cone with its
vertex at $p$. The next, 4-dimensional, smooth quadric $Q_4 \subset \mathbb{P}_5$ is 2-planar. Each
plane $\pi \subset Q_4$ passing through a given point $p \in Q_4$ is spanned by $p$ and some line
on Segre's quadric $Q_2 \subset \mathbb{P}_3 \subset T_pQ_4 \smallsetminus \{p\}$. Thus, these planes split into two pencils
corresponding to two line rulings of $Q_2$.


Example 17.10 (Subspaces on Real Smooth Quadrics) For $\Bbbk = \mathbb{R}$ and every
dimension $n$, the smooth elliptic quadric $Q_{n,0}$ is 0-planar and contains no lines.
Each point $p$ of the 1-planar quadric $Q_{n,1}$ is a vertex of the cone ruled by the
lines joining $p$ with the points of some smooth $(n - 2)$-dimensional elliptic quadric
$Q_{n-2,0} \subset \mathbb{P}_{n-1} \subset T_pQ_{n,1} \smallsetminus \{p\}$. All these lines lie on $Q_{n,1}$, and there are no other
lines on $Q_{n,1}$ passing through $p$. In particular, each point $p$ on the real Segre's quadric
$Q_{2,1} \subset \mathbb{P}_3$ is an intersection point of two lines passing through two points of the
0-dimensional elliptic quadric $Q_{0,0} \subset \mathbb{P}_1 \subset T_pQ_{2,1} \smallsetminus \{p\}$.
17.4.3 Polarities


Recall that we write $\hat q : V \xrightarrow{\sim} V^*$ for the correlation map³⁵ associated with the
nonsingular symmetric bilinear form $\tilde q$. It sends a vector $v \in V$ to the linear
form $\hat q v : u \mapsto \tilde q(u,v)$ on $V$ and produces a linear projective isomorphism
$\overline q : \mathbb{P}(V) \xrightarrow{\sim} \mathbb{P}(V^*)$ called the polar map (or just the polarity for short) associated
with the smooth quadric $Q = Z(q) \subset \mathbb{P}(V)$. If we interpret $\mathbb{P}(V^*)$ as the space of
hyperplanes in $\mathbb{P}(V)$, then we can say that the polarity $\overline q$ sends a point $a \in \mathbb{P}_n$ to
the hyperplane $\Pi_a \subset \mathbb{P}_n$ described by the linear equation $\tilde q(x,a) = 0$ in $x$. The
point $a$ and the hyperplane $\Pi_a$ are called the pole and polar of each other with
respect to $Q$. If $a \notin Q$, then $\Pi_a$ cuts the apparent contour³⁶ of $Q$ viewed from $a$, i.e.,
$\Pi_a \cap Q = \{b \in Q \mid a \in T_bQ\}$. If $a \in Q$, then $\Pi_a = T_aQ$. Thus, the quadric $Q$ is
recovered from the polar map as the locus of all points lying on their own polars.
Since the orthogonality condition $\tilde q(a, b) = 0$ is symmetric in $a$, $b$, the point $a$ lies
on the polar of the point $b$ if and only if the point $b$ lies on the polar of $a$. This claim
is known as polar duality. Points $a$, $b$ such that $\tilde q(a,b) = 0$ are called conjugate
with respect to the quadric $Q$.


Proposition 17.8 Let $Q$ be a smooth quadric and $a, b \notin Q$ two different points
such that the line $\ell = (ab)$ intersects $Q$ in two different points $c$, $d$. Then $a$, $b$ are
conjugate with respect to $Q$ if and only if the pair $\{a,b\}$ is harmonic³⁷ to the pair
$\{c, d\}$ on the line $\ell$.


Proof Let $Q = Z(q)$ and write $x = (x_0 : x_1)$ for the homogeneous coordinates on $\ell$
in the basis $c, d \in \ell$. In these coordinates, the restriction of $q$ onto $\ell$ is given, up to
proportionality, by the quadratic form $q(x) = \det(x, c) \cdot \det(x, d)$.

Taking the polarization, we get a symmetric bilinear form in $x = (x_0 : x_1)$ and
$y = (y_0 : y_1)$:

$$\tilde q(x, y) = \frac{1}{2} \bigl( \det(x, c) \cdot \det(y, d) + \det(y, c) \cdot \det(x, d) \bigr).$$

The orthogonality condition $\tilde q(a, b) = 0$ is equivalent to the relation

$$\det(a, c) \cdot \det(b, d) = - \det(b, c) \cdot \det(a, d),$$

which says that $[a, b, c, d] = -1$. □
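
The equivalence of conjugacy and harmonicity can be tried out on a line with explicit coordinates. The Python sketch below is illustrative only; the function names are assumptions made here, and the cross ratio is taken in the convention $[a,b,c,d] = \dfrac{\det(a,c)\det(b,d)}{\det(a,d)\det(b,c)}$, which is assumed to match the one used in the text.

```python
from fractions import Fraction


def det2(x, p):
    """det(x, p) = p1*x0 - p0*x1 for points x, p of the projective line."""
    return p[1] * x[0] - p[0] * x[1]


def polar(x, y, c, d):
    """Polarization of q(x) = det(x, c) * det(x, d)."""
    return Fraction(det2(x, c) * det2(y, d) + det2(y, c) * det2(x, d), 2)


def cross_ratio(a, b, c, d):
    """Cross ratio [a, b, c, d] in the convention stated in the lead-in."""
    return Fraction(det2(a, c) * det2(b, d), det2(a, d) * det2(b, c))


c, d = (1, 0), (0, 1)            # the two points of the quadric on the line
a = (1, 1)
b_conj = (1, -1)                 # conjugate to a
b_other = (1, 2)                 # not conjugate to a
```

With these values, `polar(a, b_conj, c, d)` vanishes exactly when `cross_ratio(a, b_conj, c, d)` equals $-1$, while the non-conjugate point gives a different cross ratio.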


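Proposition 17.9 Let $G \subset \mathbb{P}_n$ be a smooth quadric with Gramian $\Gamma$, and $Q \subset \mathbb{P}_n$ an
arbitrary quadric with Gramian $B$ in the same basis. Then the polar map $\mathbb{P}_n \to \mathbb{P}_n^\times$
provided by $G$ sends $Q$ to a quadric in $\mathbb{P}_n^\times$ with Gramian $\Gamma^{-1}B\Gamma^{-1}$ in the dual basis.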


³⁵ See Sect. 16.1 on p. 387.
³⁶ See Corollary 17.7 on p. 436.
³⁷ This means that the cross ratio satisfies $[a, b, c, d] = -1$ (see Sect. 11.6.3 on p. 274).
Proof In coordinates with respect to any dual bases in $V$ and $V^*$, the correlation
$V \xrightarrow{\sim} V^*$ of $G$ takes a vector with coordinate row $x$ to the covector with coordinate row
$y = x\Gamma$. If the vector $x$ is constrained by the relation $xBx^t = 0$, then the substitution
$x = y\Gamma^{-1}$ shows that the covector $y = x\Gamma$ satisfies the relation $y\Gamma^{-1}B(\Gamma^{-1})^ty^t =
y\Gamma^{-1}B\Gamma^{-1}y^t = 0$ (we have used that $\Gamma$ is symmetric and invertible). Conversely, if
$y\Gamma^{-1}B\Gamma^{-1}y^t = 0$, then for $x = y\Gamma^{-1}$, the relation $xBx^t = 0$ holds. □


Corollary 17.9 The tangent spaces of a smooth quadric $S \subset \mathbb{P}_n$ form a smooth
quadric $S^\times \subset \mathbb{P}_n^\times$. The Gramians of $S$ and $S^\times$ in dual bases of $\mathbb{P}_n$ and $\mathbb{P}_n^\times$ are inverse
to each other.


Proof Apply Proposition 17.9 for $G = Q = S$. □
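
The statement about dual Gramians lends itself to a direct numerical check. The Python sketch below is an illustration under stated assumptions: the helper names (`inverse`, `matmul`, `quad`, `polar_image`, `dual_gramian`) and the sample Gramian are introduced here and do not come from the text. It sends a point $x$ of a smooth quadric to the covector $y = x\Gamma$ and verifies that $y$ satisfies the equation with Gramian $\Gamma^{-1}B\Gamma^{-1}$.

```python
from fractions import Fraction


def inverse(m):
    """Exact inverse of an invertible square matrix by Gauss-Jordan elimination."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def quad(gram, x):
    """Value x G x^t of the quadratic form with Gramian G on the row x."""
    n = len(x)
    return sum(gram[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def polar_image(gamma, x):
    """Coordinate row y = x * Gamma of the polar hyperplane of x."""
    n = len(x)
    return [sum(x[i] * gamma[i][j] for i in range(n)) for j in range(n)]


def dual_gramian(gamma, b):
    """Gramian Gamma^{-1} B Gamma^{-1} of the image quadric."""
    g_inv = inverse(gamma)
    return matmul(matmul(g_inv, b), g_inv)


gamma = [[2, 1, 0], [1, 1, 0], [0, 0, -1]]   # sample smooth conic S
x = [0, 1, 1]                                 # a point of S
y = polar_image(gamma, x)                     # its tangent line as a covector
dual = dual_gramian(gamma, gamma)             # equals the inverse of gamma
```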


Proposition 17.10 Over an infinite field, two nonempty smooth quadrics coincide 
if and only if their equations are proportional. 


Proof Let $Z(q_1) = Z(q_2)$ in $\mathbb{P}(V)$. Then the two polarities $\overline q_1, \overline q_2 : \mathbb{P}(V) \xrightarrow{\sim} \mathbb{P}(V^*)$
coincide at all points of the quadrics.


Exercise 17.16 Check that over an infinite field, every nonempty smooth quadric
in $\mathbb{P}_n$ contains $n + 2$ points such that no $(n + 1)$ of them lie within a hyperplane.


It follows from Exercise 17.16 and Theorem 11.1 on p. 270 that the correlation maps
$\hat q_1, \hat q_2 : V \xrightarrow{\sim} V^*$ are proportional. Therefore, the Gramians are proportional. □


17.5 Affine Quadrics 


17.5.1 Projective Enhancement of Affine Quadrics 


Everywhere in this section we assume that the ground field $\Bbbk$ is infinite and $\operatorname{char} \Bbbk \ne
2$. Let $V$ be a vector space and $f \in SV^*$ a polynomial of degree 2, not necessarily
homogeneous. An affine hypersurface³⁸ $X = Z(f) = \{v \in V \mid f(v) = 0\} \subset \mathbb{A}(V)$
is called an affine quadric. Two affine quadrics $X_1, X_2 \subset \mathbb{A}(V)$ are called affinely
equivalent³⁹ if there exists an affine automorphism $F : \mathbb{A}(V) \xrightarrow{\sim} \mathbb{A}(V)$ such that
$F(X_1) = X_2$.

Every affine quadric $X = Z(f) \subset \mathbb{A}(V)$ admits a projective enhancement
provided by the projective closure construction from Sect. 11.3.2 on p. 264. We
put $W = \Bbbk \oplus V$, $e_0 = (1,0) \in \Bbbk \oplus V$, and write $x_0 \in W^*$ for the unique basis
vector in $\operatorname{Ann} V$ such that $x_0(e_0) = 1$. We decompose the affine equation for $X$ into
homogeneous parts, $f = f_0 + f_1 + f_2$, where $f_0 \in \Bbbk$, $f_1 \in V^*$, $f_2 \in S^2V^*$, and put

$$q = \overline f = f_0 \cdot x_0^2 + f_1 \cdot x_0 + f_2 \in S^2W^*. \tag{17.19}$$


³⁸ See Sect. 11.2.4 on p. 262.
³⁹ Or isomorphic.
Then $q(e_0 + v) = f(v)$ for all $v \in V$. The projective quadric $Q = Z(q) \subset \mathbb{P}(W)$
is called the projective closure of $X$, and the quadratic form $q \in S^2W^*$ is called the
extended quadratic form of $X$. The affine part of this quadric visible in the standard
chart

$$U_0 = U_{x_0} = \{w \in W \mid x_0(w) = 1\} = e_0 + V$$

can be identified with $X$, because in every affine coordinate system within $U_0$
originating at $e_0$, the quadric $Q_{\mathrm{aff}} = Q \cap U_0$ is given by the equation $q(e_0 + v) =
f(v) = 0$ in the vector $v \in V$. We write

$$H_\infty \overset{\text{def}}{=} \mathbb{P}(\operatorname{Ann} x_0) = \mathbb{P}(V) \subset \mathbb{P}(W)$$

for the hyperplane at infinity of the chart $U_0$ and $Q_\infty = Q \cap H_\infty$ for the infinitely
distant part of $Q$. The projective quadric $Q_\infty$ is called the asymptotic quadric of
$X$. It is defined in $\mathbb{P}(V) = H_\infty$ by the quadratic form $q|_V = f_2 \in S^2V^*$, the
leading quadratic form of $X$. Passing to projective enhancement transforms the
affine classification of quadrics to the projective classification of pairs "projective
quadric $Q$ + hyperplane $H_\infty$" such that $H_\infty \not\subset Q$ and $Q \not\subset H_\infty$.
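
In coordinates, the extended quadratic form has a block Gramian with $f_0$ in the corner, $\tfrac12 f_1$ along the first row and column, and the Gramian of $f_2$ in the remaining block. The following Python sketch is an illustration; the function names and the sample parabola are assumptions introduced here. It builds this Gramian and checks the identity $q(e_0 + v) = f(v)$.

```python
from fractions import Fraction


def extended_gramian(f0, f1, f2):
    """Gramian of q = f0*x0^2 + f1*x0 + f2 on W = k (+) V.

    f0: constant term, f1: coefficients of the linear part,
    f2: symmetric Gramian of the leading quadratic form.
    """
    n = len(f1)
    half = [Fraction(c) / 2 for c in f1]
    top = [Fraction(f0)] + half
    rest = [[half[i]] + [Fraction(f2[i][j]) for j in range(n)]
            for i in range(n)]
    return [top] + rest


def quad(gram, w):
    """Value of the quadratic form with Gramian `gram` at the vector w."""
    n = len(w)
    return sum(gram[i][j] * w[i] * w[j] for i in range(n) for j in range(n))


def affine_value(f0, f1, f2, v):
    """Value f(v) = f0 + f1(v) + f2(v) of the affine polynomial."""
    n = len(v)
    linear = sum(f1[i] * v[i] for i in range(n))
    quadratic = sum(f2[i][j] * v[i] * v[j]
                    for i in range(n) for j in range(n))
    return f0 + linear + quadratic


# Sample parabola f = x1^2 - x2 in the affine plane.
f0, f1, f2 = 0, [0, -1], [[1, 0], [0, 0]]
gram = extended_gramian(f0, f1, f2)
v = [3, 5]
```

Evaluating `quad(gram, [1] + v)` reproduces $f(v)$, and evaluating on vectors with $x_0 = 0$ reproduces the leading form $f_2$, which defines the asymptotic quadric.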


Proposition 17.11 Let two nonempty affine quadrics $X' = Z(f')$, $X'' = Z(f'') \subset
\mathbb{A}(V)$ have projective closures $Q', Q'' \subset \mathbb{P}(W)$. Then $X'$ and $X''$ are affinely
equivalent if and only if there exists a linear projective automorphism $F : \mathbb{P}(W) \xrightarrow{\sim} \mathbb{P}(W)$
such that $F(Q') = Q''$ and $F(H_\infty) = H_\infty$.


Proof We identify $X'$ and $X''$ with $Q'_{\mathrm{aff}} = Q' \cap U_0$ and $Q''_{\mathrm{aff}} = Q'' \cap U_0$ respectively.
Let us show first that an affine automorphism $\varphi : U_0 \xrightarrow{\sim} U_0$ is the same as a projective
automorphism $\overline F : \mathbb{P}(W) \xrightarrow{\sim} \mathbb{P}(W)$ sending $H_\infty$ to itself. By definition,⁴⁰ every affine
automorphism $\varphi$ of an affine space $U_0 = e_0 + V$ maps

$$e_0 + v \mapsto f + D_\varphi v \quad \forall\, v \in V, \tag{17.20}$$

where $f = \varphi(e_0) \in U_0$ and $D_\varphi : V \xrightarrow{\sim} V$, the differential of $\varphi$, is a linear
automorphism by Proposition 6.7 on p. 148. Let $F : W \xrightarrow{\sim} W$ be a linear
automorphism of $W = \Bbbk \oplus V$ determined by the block matrix

$$\begin{pmatrix} 1 & 0 \\ f & D_\varphi \end{pmatrix}, \tag{17.21}$$

where $1 \in \Bbbk$, $0 = (0, \ldots, 0) \in \Bbbk^n$, $n = \dim V$. Clearly, $F(V) = V$, $F(U_0) = U_0$,
and $F|_{U_0} = \varphi$. Conversely, given an automorphism $\overline F : \mathbb{P}(W) \xrightarrow{\sim} \mathbb{P}(W)$ sending $\mathbb{P}(V)$
to itself, choose a linear isomorphism $F : W \xrightarrow{\sim} W$ that induces $\overline F$. Then the action
of $F$ on $W = \Bbbk \oplus V$ is given by a block matrix of the form

$$\begin{pmatrix} \lambda & 0 \\ f & \psi \end{pmatrix},$$

where $\lambda \in \Bbbk$, $0 = (0, \ldots, 0) \in \Bbbk^n$, $f \in V$, and $\psi \in \operatorname{End} V$. Since $F$ is invertible,
$\lambda \ne 0$ and $\psi \in \mathrm{GL}(V)$. If we rescale $F$ without changing $\overline F$ in order to get $\lambda = 1$,
we obtain a map of the form (17.21), which sends $U_0$ to itself and induces there an
affine automorphism (17.20).

Now let $\overline F : \mathbb{P}(W) \xrightarrow{\sim} \mathbb{P}(W)$ take $Q'$ to $Q''$ and send $H_\infty$ to itself. Then $\overline F$ maps
$Q'_{\mathrm{aff}}$ to $Q''_{\mathrm{aff}}$ and assigns an affine automorphism of $U_0$. Conversely, if $\overline F(Q'_{\mathrm{aff}}) = Q''_{\mathrm{aff}}$,
then the projective quadrics $\overline F(Q')$ and $Q''$ coincide outside the projective hyperplane
$H_\infty$. Then they coincide everywhere by Lemma 17.5 below. □


⁴⁰ See Definition 6.5 on p. 148.


Lemma 17.5 Let $H \subset \mathbb{P}_n$ be a hyperplane and $Q \subset \mathbb{P}_n$ a nonempty quadric such
that $Q \not\subset H$ and $H \not\subset Q$. If the ground field is infinite, then $Q \cap H$ is uniquely
determined by $Q \smallsetminus H$.


Proof For $n = 1$, the statement is obvious from Corollary 17.5 on p. 435. Consider
$n \ge 2$. If $Q = V(q)$ is smooth, then the same arguments as in Proposition 17.10 on
p. 444 allow us to find $n + 2$ points in $Q \smallsetminus H$ such that no $n + 1$ of them lie within a
hyperplane. These $n + 2$ points completely determine the polarity

$$\overline q : \mathbb{P}(V) \xrightarrow{\sim} \mathbb{P}(V^*),$$

which fixes the equation of $Q$ up to proportionality. If $Q$ is not smooth but has some
smooth point $a \in Q \smallsetminus H$, let $L \ni a$ be the projective subspace complementary to
$\operatorname{Sing} Q$. By Theorem 17.3 on p. 437, $Q$ is the linear join of $\operatorname{Sing} Q$ and a smooth
quadric $Q' = Q \cap L$, which satisfies, together with the hyperplane $H' = H \cap L$,
the conditions of the lemma formulated within $L$. As we have seen already, $Q' \cap H'$
is recovered from $Q' \smallsetminus H'$. Further, $\operatorname{Sing} Q \cap H$ is recovered from $Q \smallsetminus H$, because
each line $(a,b)$ with $b \in \operatorname{Sing} Q \cap H$ lies in $Q$, and all points of this line except for
$b$ are in $Q \smallsetminus H$. Since both $Q'$ and $\operatorname{Sing} Q$ are recovered from $Q \smallsetminus H$, their linear
join is recovered too. Finally, let all the points of $Q \smallsetminus H$ be singular. Then $Q$ has no
smooth points in $H$ as well, because otherwise, every line $(ab)$ with smooth $a \in H$
and singular $b \in Q \smallsetminus H$ lies on $Q$, and all its points except for $a$, $b$ are smooth and
lie outside $H$. Thus in this case, $Q = \operatorname{Sing} Q$ is a projective subspace not lying in $H$,
and therefore $Q \cap H$ is completely determined by $Q \smallsetminus H$. □


The classification of affine quadrics breaks them into four classes: smooth central
quadrics, paraboloids, simple cones, and cylinders, in accordance with the
smoothness or singularity of their projective closures $Q$ and the positional relationship
between $Q$ and the hyperplane at infinity $H_\infty$.
17.5.2 Smooth Central Quadrics


An affine quadric $X = V(f)$ is called smooth central if its projective closure $Q$
is smooth and the hyperplane at infinity $H_\infty$ is not tangent to $Q$. In this case, the
asymptotic quadric $Q_\infty = Q \cap H_\infty$ is smooth by Proposition 17.7. In the language
of equations, $X$ is smooth central if and only if its extended quadratic form (17.19)
and leading quadratic form $f_2$ both have nonzero Gram determinants. The epithet
"central" is explained as follows. Write $c \in \mathbb{P}(W)$ for the pole of $H_\infty$ with respect
to $Q$. Since $H_\infty$ is not tangent to $Q$, $c \in U_0 \smallsetminus Q_{\mathrm{aff}}$. If a line $\ell = (cd)$ joining
$c$ with some point $d \in H_\infty \smallsetminus Q$ intersects $Q$ in points $a, b \in Q_{\mathrm{aff}}$, then by
Proposition 17.8, the cross ratio $[d, c, b, a]$ is equal to $-1$. This means⁴¹ that in the
affine part $U_0 \cap \ell = \ell \smallsetminus d$ of this line, the point $c$ is the midpoint of $[a, b]$, i.e.,
a central smooth affine quadric is centrally symmetric with respect to the pole of
the hyperplane at infinity. For this reason, $c$ is called the center of the quadric. In
every affine coordinate system in $U_0$ originating at $c$, the polynomial $f$ that defines
$X = Q_{\mathrm{aff}}$ has no linear term and is of the form $f(x) = f_0 + f_2(x)$, where $f_0 \ne 0$.


Exercise 17.17 Check this.

In an orthogonal basis for $f_2$ in $V$, the affine equation for $Q_{\mathrm{aff}}$, on dividing both sides
by $f_0$, takes the form

$$a_1x_1^2 + a_2x_2^2 + \cdots + a_nx_n^2 = 1. \tag{17.22}$$

Over an algebraically closed field, it can be simplified to

$$x_1^2 + x_2^2 + \cdots + x_n^2 = 1$$

by rescaling the variables. Hence, all smooth central affine quadrics over an
algebraically closed field are affinely equivalent. Over $\mathbb{R}$, the equation (17.22) can
be simplified to

$$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+m}^2 = \pm 1, \quad \text{where } p \ge m,\ p + m = n, \tag{17.23}$$

and for $p = m = n/2$, only $+1$ is allowed⁴² on the right-hand side. Among the
quadrics (17.23) there is exactly one that is empty. It has the equation $\sum x_i^2 = -1$
and is called the imaginary ellipsoid. There is also exactly one quadric without
points at infinity. It has the equation $\sum x_i^2 = 1$ with $m = 0$ and is called the
ellipsoid.


⁴¹ See Sect. 11.6.3 on p. 274.


⁴² For $p = m = n/2$, the equation (17.23) with $-1$ on the right-hand side is transformed to the
same equation with $+1$ by changing the signs of both sides and renumbering the variables.
Exercise 17.18 Check that the ellipsoid is bounded and 0-planar.


All other quadrics (17.23) have $Q_\infty \ne \varnothing$ and are called hyperboloids. For $p > m$
and $+1$ on the right-hand side of (17.23), the projective quadric $Q$ has signature
$(p, m + 1)$ and therefore is $m$-planar. For $p > m$ and $-1$ on the right-hand side
of (17.23), the signature of $Q$ is $(p + 1, m)$, and $Q$ is $(m - 1)$-planar. For $p = m = n/2$,
the signature of $Q$ is $(m, m + 1)$, and the planarity of $Q$ equals $n/2 - 1$. Thus, the 0-planar
quadrics (17.23) are exhausted by the ellipsoid and hyperboloid of two sheets
$x_1^2 + \cdots + x_{n-1}^2 = x_n^2 - 1$.


Exercise 17.19 Convince yourself that the hyperboloid of two sheets has two 
connected components, whereas all the other quadrics (17.23) are connected. 


The infinitely distant piece $Q_\infty = Q \cap H_\infty$ of the quadric (17.23) is given within
$H_\infty = \mathbb{P}(V)$ by the equation

$$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+m}^2 = 0$$

and is $(m - 1)$-planar regardless of the sign on the right-hand side of (17.23). Hence, there
are no affinely equivalent quadrics among (17.23).


17.5.3 Paraboloids 


An affine quadric $X = V(f)$ is called a paraboloid if its projective closure $Q$ is
smooth and $H_\infty = T_cQ$ is tangent to $Q$ at some point $c \in Q_\infty$. In this case,⁴³ the
asymptotic quadric $Q_\infty = H_\infty \cap Q$ has exactly one singular point, namely $c$. In
terms of equations, $X = V(f)$ is a paraboloid if and only if the Gram determinant
of the extended quadratic form (17.19) is nonzero, whereas the leading quadratic form $f_2$
has 1-dimensional kernel, and in this case, the kernel is spanned by the vector $c$.

Since $q$ is nondegenerate, the isotropic vector $c$ is included in some pair $b$, $c$
spanning the hyperbolic plane $\Pi \subset W$. Then $W = \Pi \oplus \Pi^\perp$, where $\Pi^\perp \subset V =
\operatorname{Ann} x_0$, because $V = c^\perp$. Since $b \notin \operatorname{Ann} x_0$, it follows that $x_0(b) \ne 0$, and we can
rescale $b$, $c$ to a hyperbolic basis $e_0 = \lambda b$, $e_n = \mu c$ of $\Pi$ such that $x_0(e_0) = 1$.
Write $e_1, e_2, \ldots, e_{n-1}$ for an orthogonal basis in $\Pi^\perp$ and provide $U_0$ with an affine
coordinate system originating at $e_0 \in U_0$ with basis $e_1, e_2, \ldots, e_n$ in $V$. The affine
equation for $X$ in this system is

$$a_1x_1^2 + a_2x_2^2 + \cdots + a_{n-1}x_{n-1}^2 = x_n. \tag{17.24}$$


⁴³ See Sect. 17.7 on p. 441.
Over an algebraically closed field, it can be simplified to

$$x_1^2 + x_2^2 + \cdots + x_{n-1}^2 = x_n.$$

Thus, all paraboloids of a given dimension are affinely equivalent over an
algebraically closed field. Over $\mathbb{R}$, the equation (17.24) can be simplified to

$$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_{p+m}^2 = x_n, \quad \text{where } p \ge m,\ p + m = n - 1. \tag{17.25}$$

The paraboloid (17.25) is $m$-planar. Therefore, all paraboloids (17.25) are nonempty
and mutually inequivalent. The zero-planar paraboloid $x_1^2 + \cdots + x_{n-1}^2 = x_n$ is called
elliptic, and all the other paraboloids are called hyperbolic.


17.5.4 Simple Cones 


An affine quadric $X = V(f)$ is called a simple cone if its projective closure $Q$ is
singular and the asymptotic quadric $Q_\infty$ is smooth.


Exercise 17.20 Convince yourself that for every singular quadric $Q \subset \mathbb{P}(W)$, the
smoothness of $Q_\infty = Q \cap H_\infty$ is equivalent to the emptiness of $\operatorname{Sing} Q \cap H_\infty$.


Since the hyperplane $H_\infty$ does not intersect Sing Q, the latter subspace is
0-dimensional. Therefore, Q has exactly one singular point c, with $c \in U_0$, and X is
ruled by the lines joining c with the points of the asymptotic quadric lying at infinity.
In the language of equations, this means that the extended quadratic form (17.19) has a
1-dimensional kernel, whose generator c can be normalized such that $x_0(c) = 1$. If
we place the origin of the affine coordinate system in $U_0$ at c, then both terms $f_0$,
$f_1$ disappear from f, and X turns out to be given by a nondegenerate homogeneous
quadratic form $f_2 \in S^2V^*$. Passing to its orthogonal basis, we get the equation


$$a_1x_1^2 + a_2x_2^2 + \dots + a_nx_n^2 = 0. \tag{17.26}$$

Over an algebraically closed field, it can be simplified to

$$x_1^2 + x_2^2 + \dots + x_n^2 = 0. \tag{17.27}$$


Thus, over an algebraically closed field there is exactly one simple cone up to affine 
equivalence. Over R, we get a collection of cones 


$$x_1^2 + \dots + x_p^2 = x_{p+1}^2 + \dots + x_{p+m}^2, \quad\text{where } p \ge m \text{ and } p + m = n. \tag{17.28}$$


The homogeneous equation (17.28) defines a projective quadric of planarity m − 1 in
P(V). Therefore, the affine cone (17.28) is m-planar. Hence, all simple cones (17.28)
are affinely inequivalent. Note that for m = 0, the real quadric (17.28) consists
of just one point, the origin.


17.5.5 Cylinders 


An affine quadric X = V(f) is called a cylinder if both the projective closure Q and
the asymptotic quadric $Q_\infty$ are singular. By Exercise 17.20, this means that


$$\operatorname{Sing} Q \cap H_\infty \ne \varnothing.$$


In terms of equations, an affine quadric X is a cylinder if and only if both forms q
and $f_2$ are degenerate. Take a basis $e_1, e_2, \dots, e_n$ of V such that the vectors $e_i$ with
i > r form a basis in $\ker q \cap V$. Then the last n − r coordinates disappear from the
equation for X. Thus, X is a direct product of the affine space $\mathbb{A}^{n-r}$ parallel to those
coordinates and an affine quadric in $\mathbb{A}^r$ without singularities at infinity. The latter
belongs to one of the three groups described above.


Example 17.11 (Real Affine Plane Curves of Second Order) The complete list of
nonempty second-order “curves” in $\mathbb{R}^2$ up to affine equivalence is as follows:


◦ ellipse $x_1^2 + x_2^2 = 1$, a smooth central curve without points at infinity;

◦ hyperbola $x_1^2 - x_2^2 = 1$, a smooth central curve intersecting the line at infinity
$x_0 = 0$ in two distinct points (0 : 1 : 1), (0 : 1 : −1);

◦ parabola $x_1^2 = x_2$, a smooth curve tangent to infinity at (0 : 0 : 1);

◦ double point $x_1^2 + x_2^2 = 0$, a simple cone over a smooth empty quadric at infinity;

◦ cross $x_1^2 - x_2^2 = 0$, a simple cone over a smooth nonempty quadric at infinity;

◦ parallel lines $x_1^2 = 1$, a cylinder over two points in $\mathbb{A}^1$ (that is, over a smooth
nonempty 0-dimensional quadric);

◦ double line $x_1^2 = 0$, a cylinder over a double point in $\mathbb{A}^1$ (that is, over a singular
0-dimensional quadric).


The three smooth curves ellipse, hyperbola, parabola are affine pieces of the same 
projective conic of signature (2, 1), the Veronese conic. 


Example 17.12 (Real Affine Quadratic Surfaces) The complete list of nonempty
second-order “surfaces” in $\mathbb{R}^3$ up to affine equivalence is twice as long. There are
three smooth central surfaces:


◦ ellipsoid $x_1^2 + x_2^2 + x_3^2 = 1$, a projective quadric of signature (3, 1) viewed in
a chart whose infinite prime does not meet the quadric; the ellipsoid is compact
and 0-planar (see Fig. 17.3);

◦ hyperboloid of two sheets $x_1^2 + x_2^2 = x_3^2 - 1$, the same projective quadric of
signature (3, 1) but viewed in a chart whose infinite prime crosses the quadric
along a smooth conic; the hyperboloid of two sheets is 0-planar and has two
connected components (see Fig. 17.4);

◦ hyperboloid of one sheet $x_1^2 + x_2^2 = x_3^2 + 1$, the Segre quadric, of signature (2, 2),
viewed in a chart whose infinite prime crosses the quadric along a smooth conic;
the hyperboloid of one sheet is 1-planar and has two line rulings (see Fig. 17.5).


Fig. 17.3 Ellipsoid $x_1^2 + x_2^2 + x_3^2 = 1$

Fig. 17.4 Hyperboloid of two sheets $x_1^2 + x_2^2 = x_3^2 - 1$


Also, there are two paraboloids: 


◦ elliptic paraboloid $x_1^2 + x_2^2 = x_3$, a projective quadric of signature (3, 1) viewed
in a chart whose infinite prime touches the quadric at (0 : 0 : 0 : 1) and has no
more intersections with it; the elliptic paraboloid is 0-planar (see Fig. 17.6);

◦ hyperbolic paraboloid $x_1^2 - x_2^2 = x_3$, the Segre quadric, of signature (2, 2), viewed
in a chart whose infinite prime is tangent to the quadric at (0 : 0 : 0 : 1) and
intersects it along two lines $x_1 = \pm x_2$ crossing at the point of tangency; the
hyperbolic paraboloid is 1-planar and has two line rulings (see Fig. 17.7).


Fig. 17.8 Elliptic cone

Fig. 17.9 Hyperbolic cylinder $x_1^2 - x_2^2 = 1$


There are two simple cones over smooth projective real conics: 


◦ double point $x_1^2 + x_2^2 + x_3^2 = 0$, a cone over the empty conic;

◦ elliptic cone $x_1^2 - x_2^2 = x_3^2$, a cone over a nonempty conic (see Fig. 17.8).


Finally, there are seven cylinders over the nonempty second-order “curves” in $\mathbb{R}^2$
listed in Example 17.11 above. They are given by exactly the same equations but
now considered in $\mathbb{R}^3$ instead of $\mathbb{R}^2$. The corresponding surfaces are called elliptic,
hyperbolic,⁴⁴ and parabolic cylinders, parallel planes, a double line, intersecting
planes, and a double plane. Altogether, we have 14 affinely inequivalent affine
quadrics.


44 See Fig. 17.9.
Problems for Independent Solution to Chap. 17


Problem 17.1 In the standard basis of $\mathbb{R}^5$, a symmetric bilinear form $\beta$ has Gram
matrix

$$\begin{pmatrix}
-12 & 14 & -5 & -3 & 8\\
14 & -17 & 2 & 5 & -8\\
-5 & 2 & -12 & 3 & 6\\
-3 & 5 & 3 & 3 & 1\\
8 & -8 & 6 & 1 & -6
\end{pmatrix}.$$


(a) Find the rank and signature for the restriction of $\beta$ to the subspace given by the
linear equations

$$\begin{aligned}
2x_1 + 2x_2 - 3x_3 - 4x_4 - 7x_5 &= 0,\\
x_1 - x_2 + 2x_3 + 2x_4 + 4x_5 &= 0.
\end{aligned}$$


(b) Write an explicit equation for a hyperplane $\pi$ such that the two lines spanned by
the vectors u = (3, 0, 2, 3, 6) and w = (0, 3, −11, −12, −18) are interchanged
by the $\beta$-orthogonal reflection in $\pi$.

(c) Compute the $\beta$-orthogonal projections of u and w on $\pi$.


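Problem 17.2 Is there a quadratic form on $\mathbb{R}^7$ with principal upper left minors

(a) $\Delta_1 > 0$, $\Delta_2 = 0$, $\Delta_3 > 0$, $\Delta_4 < 0$, $\Delta_5 = 0$, $\Delta_6 < 0$, $\Delta_7 > 0$;

(b) $\Delta_1 > 0$, $\Delta_2 = 0$, $\Delta_3 > 0$, $\Delta_4 < 0$, $\Delta_5 = 0$, $\Delta_6 < 0$, $\Delta_7 < 0$;

(c) $\Delta_1 > 0$, $\Delta_2 = 0$, $\Delta_3 > 0$, $\Delta_4 < 0$, $\Delta_5 = 0$, $\Delta_6 < 0$, $\Delta_7 < 0$?

If such a form exists, compute its signature and give an explicit example of a
Gramian.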

Problem 17.3 Find the rank and signature of the quadratic forms $\operatorname{tr}(A^2)$ and $\operatorname{tr}(AA^t)$
on $\operatorname{Mat}_n(\mathbb{R})$.

Problem 17.4 Expand the characteristic polynomial of the matrix $X \in \operatorname{Mat}_n(\mathbb{R})$ as

$$\det(tE - X) = t^n - \sigma_1(X)\,t^{n-1} + \sigma_2(X)\,t^{n-2} - \dots + (-1)^n \det X.$$

Convince yourself that $\sigma_2(X)$ is a quadratic form on the vector space $\operatorname{Mat}_n(\mathbb{R})$ and
find its rank and signature. To begin with, consider n = 2, 3, 4.
Problem 17.5 Convince yourself that $A \mapsto \det A$ is a quadratic form on $\operatorname{Mat}_2(\Bbbk)$
with polarization $\det(A, B) = \operatorname{tr}(AB^\vee)/2$, where $B^\vee$ is the adjunct matrix⁴⁵ of B.
Find the rank and signature of this form over $\Bbbk = \mathbb{R}$. Is it hyperbolic over
$\Bbbk = \mathbb{F}_p$?

Problem 17.6 Write W for the space of binary quadratic forms in the variables
$(x_0, x_1)$. For a matrix $A \in \operatorname{GL}_2(\Bbbk)$, consider the map

$$S^2A : W \to W, \quad f(x_0, x_1) \mapsto f\bigl((x_0, x_1)\cdot A\bigr).$$

Check that it is linear and write its matrix in the basis $x_0^2$, $2x_0x_1$, $x_1^2$. Express $\operatorname{tr} S^2A$
and $\det S^2A$ in terms of $\operatorname{tr} A$ and $\det A$.


Problem 17.7 Consider the residue ring $K = \mathbb{F}_3[x]/(x^3 - x + 1)$ as a 3-dimensional
vector space over $\mathbb{F}_3$ equipped with a bilinear form tr(ab), the trace of the
multiplication operator $x \mapsto abx$ on K. Write the Gramian of this form in the basis
$[1]$, $[x]$, $[x^2]$. Does K contain a hyperbolic plane? If so, write down a hyperbolic
basis of this plane explicitly. If not, explain why.


Problem 17.8 Show that the following properties of a vector space W with a
nonsingular quadratic form are equivalent: (a) W is hyperbolic; (b) W is a direct
sum⁴⁶ of isotropic subspaces; (c) dim W is even and W contains some isotropic
subspace of dimension dim W/2.


Problem 17.9 Write the homogeneous equation of a smooth conic $C \subset \mathbb{P}_2$ in the
basis $e_0, e_1, e_2$ such that the triangle $e_0e_1e_2$ is (a) autopolar with respect to C;
(b) circumscribed about C; (c) inscribed in C.


Problem 17.10 Given a circle C in the Euclidean plane $\mathbb{R}^2$, draw the pole of a given
line and the polar of a given point under the polar map provided by C (a) using a
straightedge and compass; (b) using only a straightedge. (Pay special attention
to cases in which the given line does not intersect C and the given point lies inside
the disk bounded by C.)


Problem 17.11 Using only a straightedge, draw the tangent line to a given conic C 
from a given point (a) on C; (b) outside C. 

Problem 17.12 How many common tangent lines can two different conics on $\mathbb{P}_2$
over an algebraically closed field have?


Problem 17.13 Given five lines on $\mathbb{P}_2$ with no three concurrent among them, how
many conics touch all of them?


Problem 17.14 Let the vertices a, b, c, d of the complete quadrangle from
Example 11.9 on p. 274 lie on some smooth conic C. Show that the associated triangle
xyz is autopolar with respect to C, meaning that its vertices are the poles of the
opposite sides.


45 See Sect. 9.6 on p. 220.
46 Summands are not assumed to be orthogonal.


Problem 17.15 Show that two triangles on $\mathbb{P}_2$ are perspective⁴⁷ if and only if they
are polar to each other with respect to some smooth conic.


Problem 17.16 Show that two triangles in P2 are circumscribed about a common 
conic if and only if they are inscribed in a common conic. 


Problem 17.17 (Cross Ratio on a Smooth Conic) For a quadruple a, b, c, d of
distinct points on a smooth conic C, write [a, b, c, d] for the cross ratio of lines
[(pa), (pb), (pc), (pd)] in the pencil of lines passing through an auxiliary point⁴⁸
$p \in C$. Show that:


(a) [a, b, c, d] does not depend on⁴⁹ p.

(b) Two chords [a, b], [c, d] of C are conjugate⁵⁰ with respect to C if and only if
[a, b, c, d] = −1.

(c) In Example 17.5 on p. 437, the cross ratio of double points

$$[\{a, a\}, \{b, b\}, \{c, c\}, \{d, d\}]$$

on the Veronese conic $C \subset S^2\mathbb{P}_1$ is equal to the cross ratio [a, b, c, d] on $\mathbb{P}_1$.


Problem 17.18 (Segre’s Quadric) In the notation of Example 17.6:


(a) Describe the polarity on $\mathbb{P}(\operatorname{Mat}_2(\Bbbk))$ provided by the quadratic form det(X).
(b) Show that the operators $\xi \otimes v$ linearly span the vector space End(U).
(c) Prove the equivalence of the following properties of an operator $F \in \operatorname{End}(U)$:


1. $F \in T_{\xi\otimes v}Q_S$;
2. $F(\operatorname{Ann}(\xi)) \subset \Bbbk\cdot v$;
3. $\exists\, \eta \in U^*,\ w \in U : F = \xi \otimes w + \eta \otimes v$.


(d) Verify that the projective automorphism $F : \mathbb{P}(U) \to \mathbb{P}(U)$ induced by an
operator $F \in \operatorname{End}(U) \smallsetminus Q_S$ acts on a point $p = \mathbb{P}\operatorname{Ann}(\xi) \in \mathbb{P}(U)$ as follows.
The line $L' = \xi \times \mathbb{P}_1$, which lies in the first ruling family, and the point F span
a plane in $\mathbb{P}_3 = \mathbb{P}(\operatorname{End}(U))$. This plane intersects Segre’s quadric $Q_S$ in a pair
of crossing lines $L' \cup L''$, where $L'' = \mathbb{P}_1^\times \times v$ lies in the second ruling family.
Then v = F(p).


47 See Problem 11.12 on p. 276.

48 Equivalently, we can project C from p onto a line $\ell$ and write [a, b, c, d] for the cross ratio of the
projections.

49 And on $\ell$, if we have used the projection.

50 That is, the pole of the line (ab) lies on the line (cd).
Problem 17.19 Given four nonintersecting lines in (a) $\mathbb{P}(\mathbb{C}^4)$; (b) $\mathbb{A}(\mathbb{C}^3)$;
(c) $\mathbb{P}(\mathbb{R}^4)$; (d) $\mathbb{A}(\mathbb{R}^3)$, how many lines intersect all of them? List all possible
answers and indicate those that remain unchanged under small perturbations of
the given lines.


Problem 17.20 (Plücker Quadric) Write V and W for the spaces of homogeneous
linear and quadratic Grassmannian polynomials in the four variables
$\xi_1, \xi_2, \xi_3, \xi_4$.

(a) Show that a bilinear form $p : W \times W \to \Bbbk$ is well defined by

$$\omega_1 \wedge \omega_2 = p(\omega_1, \omega_2)\cdot \xi_1 \wedge \xi_2 \wedge \xi_3 \wedge \xi_4$$

and write its Gramian in the basis $\xi_{ij} = \xi_i \wedge \xi_j$, $1 \le i < j \le 4$.

(b) Verify that over every ground field, p is symmetric and nondegenerate and find
the signature of p over $\mathbb{R}$.

(c) Prove that $\omega \in W$ can be factored as a product of two linear forms if and only
if $p(\omega, \omega) = 0$.

(d) Check that the assignment $(ab) \mapsto a \wedge b$ provides a well-defined bijection
between the lines $(ab) \subset \mathbb{P}_3 = \mathbb{P}(V)$ and the points of Plücker’s quadric
$P = Z(p) \subset \mathbb{P}_5 = \mathbb{P}(W)$.

(e) Check that two lines in $\mathbb{P}_3$ intersect if and only if their images in $P \subset \mathbb{P}_5$ are
p-orthogonal.

(f) Show that for every $\omega = a \wedge b \in P$, the intersection $P \cap T_\omega P$ is formed by the
images of all lines intersecting the line (ab).

(g) Verify that the Plücker quadric is ruled by two families of planes $\pi_\Pi$, $\pi_a$
indexed by $\Pi \in \mathbb{P}_3^\times$, $a \in \mathbb{P}_3$, such that the plane $\pi_\Pi$ consists of all lines in
$\mathbb{P}_3$ lying within the plane $\Pi \subset \mathbb{P}_3$, and the plane $\pi_a$ consists of all lines in $\mathbb{P}_3$
passing through the point $a \in \mathbb{P}_3$.

(h) Check that every line on P is uniquely represented as $\pi_\Pi \cap \pi_a$.


Problem 17.21 Find the total number of points over the field⁵¹ $\mathbb{F}_9$ on (a) the conic
$x_0x_1 - x_1x_2 + x_0x_2 = 0$ in $\mathbb{P}_2$; (b) the quadratic surface $x_0^2 + x_1^2 + x_2^2 + x_3^2 = 0$
in $\mathbb{P}_3$.

Problem 17.22 Indicate all ranks and signatures that a hyperplane section
of a smooth real projective quadric of signature (p, m) may have.


51 Recall that the field $\mathbb{F}_9 = \mathbb{F}_3[x]/(x^2 + 1)$ consists of nine elements a + ib, where $a, b \in \mathbb{Z}/(3)$
and $i^2 \equiv -1 \pmod 3$; see Sect. 3.6.2 on p. 63.


Chapter 18 
Real Versus Complex 


18.1 Realification 


18.1.1  Realification of a Complex Vector Space 


Let W be a vector space of dimension n over the complex number field $\mathbb{C}$. Then
W can be considered a vector space over the real subfield $\mathbb{R} \subset \mathbb{C}$ as well. The
resulting vector space over $\mathbb{R}$ is denoted by $W_{\mathbb{R}}$ and called the realification of the
complex vector space W. For every basis $e_1, e_2, \dots, e_n$ of W over $\mathbb{C}$, the vectors
$e_1, e_2, \dots, e_n, ie_1, ie_2, \dots, ie_n$ form a basis of $W_{\mathbb{R}}$ over $\mathbb{R}$, because for every $w \in W$,
the uniqueness of the expansion

$$w = \sum_{\nu} (x_\nu + iy_\nu)\cdot e_\nu, \quad\text{where } (x_\nu + iy_\nu) \in \mathbb{C},$$

is equivalent to the uniqueness of the expansion

$$w = \sum_{\nu} x_\nu\cdot e_\nu + \sum_{\nu} y_\nu\cdot ie_\nu, \quad\text{where } x_\nu, y_\nu \in \mathbb{R}.$$

Therefore, $\dim_{\mathbb{R}} W_{\mathbb{R}} = 2\dim_{\mathbb{C}} W$. Note that the realification of a complex vector
space is always even-dimensional.
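
The passage from complex to real coordinates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `realify` and `complexify_coords` are hypothetical, and coordinates are taken with respect to a fixed basis e_1, ..., e_n of W and the corresponding basis e_1, ..., e_n, ie_1, ..., ie_n of the realification.

```python
def realify(coords):
    """Real coordinates (x_1..x_n, y_1..y_n) of a vector with complex
    coordinates z_nu = x_nu + i*y_nu (hypothetical helper)."""
    return [z.real for z in coords] + [z.imag for z in coords]


def complexify_coords(real_coords):
    """Inverse map: the first half holds the x_nu, the second half the y_nu."""
    n = len(real_coords) // 2
    return [complex(real_coords[k], real_coords[n + k]) for k in range(n)]


w = [1 + 2j, 3 - 1j, 0.5j]          # a vector of a 3-dimensional complex space
w_real = realify(w)                 # its 6 real coordinates
```

The list `w_real` has twice as many entries as `w`, which is the dimension count stated above, and `complexify_coords` recovers the complex coordinates.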


18.1.2 Comparison of Linear Groups 


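Write $\operatorname{End}_{\mathbb{C}}(W)$ for the $\mathbb{C}$-algebra of all $\mathbb{C}$-linear maps $F : W \to W$ and write
$\operatorname{End}_{\mathbb{R}}(W_{\mathbb{R}})$ for the $\mathbb{R}$-algebra of all $\mathbb{R}$-linear maps $G : W_{\mathbb{R}} \to W_{\mathbb{R}}$. The first algebra is
tautologically embedded in the second as an $\mathbb{R}$-subalgebra $\operatorname{End}_{\mathbb{C}}(W) \subset \operatorname{End}_{\mathbb{R}}(W_{\mathbb{R}})$.
Let us fix some basis $e_1, e_2, \dots, e_n$ of W over $\mathbb{C}$ and the corresponding basis

$$e_1, e_2, \dots, e_n, ie_1, ie_2, \dots, ie_n \tag{18.1}$$

of $W_{\mathbb{R}}$ over $\mathbb{R}$ and identify $\operatorname{End}_{\mathbb{R}}(W_{\mathbb{R}})$ with $\operatorname{Mat}_{2n}(\mathbb{R})$ by writing the operators as
(2n) × (2n) matrices in the basis (18.1). It is convenient to subdivide such a matrix
into four n × n blocks

$$G = \begin{pmatrix} A & B\\ C & D \end{pmatrix} \tag{18.2}$$

in accordance with the subdivision of basis vectors (18.1) into two groups of n
vectors $\{e_\nu\}$ and $\{ie_\nu\}$.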


Proposition 18.1 (Cauchy–Riemann Relations) An operator $G \in \operatorname{End}_{\mathbb{R}}(W_{\mathbb{R}})$
with matrix (18.2) in the basis (18.1) lies in the subalgebra $\operatorname{End}_{\mathbb{C}}(W)$ of $\operatorname{End}_{\mathbb{R}}(W_{\mathbb{R}})$
if and only if C = −B and D = A. In this case, G has the complex n × n matrix
A + iC = A − iB in the basis $e_1, e_2, \dots, e_n$ of W over $\mathbb{C}$.


Proof The $\mathbb{C}$-linearity of G means that G(iw) = iG(w) for all $w \in W_{\mathbb{R}}$. Since this
relation is $\mathbb{R}$-linear in w, it is enough to check it for the basis vectors (18.1) only.
This forces C = −B and D = A. Conversely, multiplication by the complex matrix
A + iB in W acts on the vectors (18.1) by the matrix

$$\begin{pmatrix} A & -B\\ B & A \end{pmatrix}.$$
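
The converse step of the proof can be checked numerically. The following Python sketch is an illustration only, with hypothetical helper names (`real_block_matrix`, `mat_vec`, `realify`) and arbitrarily chosen sample matrices: it applies the complex matrix A + iB to a complex vector and compares the result with the action of the real block matrix with rows (A, −B) and (B, A) on the real coordinates taken in the basis (18.1).

```python
def real_block_matrix(A, B):
    """Real 2n x 2n matrix with block rows (A, -B) and (B, A)."""
    n = len(A)
    top = [A[r] + [-b for b in B[r]] for r in range(n)]
    bottom = [B[r] + A[r] for r in range(n)]
    return top + bottom


def mat_vec(M, v):
    """Matrix-vector product for lists of lists (real or complex entries)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def realify(z):
    """Real coordinates in the basis e_1..e_n, ie_1..ie_n."""
    return [c.real for c in z] + [c.imag for c in z]


A = [[1.0, 2.0], [0.0, -1.0]]       # sample real part (assumption)
B = [[3.0, 0.0], [1.0, 4.0]]        # sample imaginary part (assumption)
M = [[complex(A[r][c], B[r][c]) for c in range(2)] for r in range(2)]
G = real_block_matrix(A, B)

w = [2 - 1j, 0.5 + 3j]
lhs = realify(mat_vec(M, w))        # act over C, then pass to real coordinates
rhs = mat_vec(G, realify(w))        # pass to real coordinates, then act by G
```

Both routes give the same four real numbers, which is what the block form of the matrix expresses.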


Example 18.1 (Complex-Differentiable Functions) Let $W = \mathbb{C}$. Then $W_{\mathbb{R}} = \mathbb{R}^2$.
The basis e = 1 of W over $\mathbb{C}$ produces the basis e = 1, ie = i of $W_{\mathbb{R}}$ over $\mathbb{R}$. Every
$\mathbb{C}$-linear operator $F : \mathbb{C} \to \mathbb{C}$ acts as $w \mapsto zw$ for some $z = a + ib \in \mathbb{C}$, which is
nothing but the 1 × 1 matrix of F in the basis e. In the basis 1, i of $W_{\mathbb{R}}$ over $\mathbb{R}$, the
operator F has real 2 × 2 matrix

$$\begin{pmatrix} a & -b\\ b & a \end{pmatrix}.$$

An arbitrary (not necessarily linear) map from $\mathbb{C} = \mathbb{R}^2$ to itself can be viewed
either as one complex function w = f(z) of one complex variable z or as a pair of
real functions of two real variables

$$\begin{cases} u = u(x, y),\\ v = v(x, y), \end{cases} \tag{18.3}$$
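related to w and z by the equalities w = u + iv, z = x + iy. A function $f : \mathbb{C} \to \mathbb{C}$,
$z \mapsto w = f(z)$, is called complex-differentiable at a point $z_0 = x_0 + iy_0$ if its
increment is approximated by a $\mathbb{C}$-linear function of the increment of its argument,
that is, if

$$f(z_0 + \Delta z) = f(z_0) + \zeta\cdot\Delta z + o(\Delta z) \tag{18.4}$$

for some $\zeta \in \mathbb{C}$. Similarly, a map $\mathbb{R}^2 \to \mathbb{R}^2$ given by a pair of functions (18.3)
is called real-differentiable if its vector increment is approximated by an $\mathbb{R}$-linear
operator acting on the vector increment of the argument, that is, if

$$\begin{pmatrix} u(x_0 + \Delta x, y_0 + \Delta y)\\ v(x_0 + \Delta x, y_0 + \Delta y) \end{pmatrix}
= \begin{pmatrix} u(x_0, y_0)\\ v(x_0, y_0) \end{pmatrix}
+ \begin{pmatrix} a & b\\ c & d \end{pmatrix}\begin{pmatrix} \Delta x\\ \Delta y \end{pmatrix}
+ o(\Delta x, \Delta y) \tag{18.5}$$

for some matrix

$$\begin{pmatrix} a & b\\ c & d \end{pmatrix} \in \operatorname{Mat}_{2\times 2}(\mathbb{R}).$$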


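It is not hard to check¹ that if the approximations (18.4) and (18.5) exist, then both
linear operators² acting on the increments of the arguments are expressed through
the derivatives of the functions in question as follows:

$$\zeta = \frac{df}{dz}(z_0) = \lim_{\Delta z \to 0}\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}, \qquad
\begin{pmatrix} a & b\\ c & d \end{pmatrix} =
\begin{pmatrix} \dfrac{\partial u}{\partial x}(x_0, y_0) & \dfrac{\partial u}{\partial y}(x_0, y_0)\\[2mm]
\dfrac{\partial v}{\partial x}(x_0, y_0) & \dfrac{\partial v}{\partial y}(x_0, y_0) \end{pmatrix},$$

where

$$\frac{\partial u}{\partial x}(x_0, y_0) = \lim_{\Delta x \to 0}\frac{u(x_0 + \Delta x, y_0) - u(x_0, y_0)}{\Delta x}, \qquad
\frac{\partial u}{\partial y}(x_0, y_0) = \lim_{\Delta y \to 0}\frac{u(x_0, y_0 + \Delta y) - u(x_0, y_0)}{\Delta y},$$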

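etc. We conclude from Proposition 18.1 that a pair of real-differentiable
functions (18.3) defines a complex-differentiable map $\mathbb{C} \to \mathbb{C}$ if and only if these
functions satisfy the differential Cauchy–Riemann equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad\text{and}\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$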
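
A quick numerical experiment makes the Cauchy–Riemann equations tangible. The sketch below is an illustration, not part of the original argument: the function names, the sample point, the step `h`, and the tolerance are all assumptions, and the partial derivatives are only approximated by central differences.

```python
def partials(f, x0, y0, h=1e-6):
    """Central-difference estimates of u_x, u_y, v_x, v_y for f = u + i*v."""
    dfx = (f(complex(x0 + h, y0)) - f(complex(x0 - h, y0))) / (2 * h)
    dfy = (f(complex(x0, y0 + h)) - f(complex(x0, y0 - h))) / (2 * h)
    return dfx.real, dfy.real, dfx.imag, dfy.imag


def satisfies_cauchy_riemann(f, x0, y0, tol=1e-5):
    """Test u_x = v_y and u_y = -v_x up to the tolerance tol."""
    ux, uy, vx, vy = partials(f, x0, y0)
    return abs(ux - vy) < tol and abs(uy + vx) < tol


def holomorphic(z):
    return z * z + 3 * z            # complex-differentiable everywhere


def conjugation(z):
    return z.conjugate()            # real-differentiable, but u_x = 1, v_y = -1


cr_for_polynomial = satisfies_cauchy_riemann(holomorphic, 0.7, -1.2)
cr_for_conjugation = satisfies_cauchy_riemann(conjugation, 0.7, -1.2)
```

The polynomial passes the test, whereas complex conjugation, whose real Jacobian is diag(1, −1), does not.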


1 See any calculus textbook.


2 They are called differentials of the function in question.
18.2 Complexification


18.2.1 Complexification of a Real Vector Space 


Every vector space V of dimension n over the field $\mathbb{R}$ admits a canonical extension
to a vector space of dimension n over $\mathbb{C}$ called the complexification of V and denoted
by $V_{\mathbb{C}}$ or $\mathbb{C} \otimes V$. This complex vector space contains the initial real space V in the
same manner as the 1-dimensional complex space $\mathbb{C}$ contains the 1-dimensional real
space $\mathbb{R} \subset \mathbb{C}$. An explicit construction of $V_{\mathbb{C}}$ is as follows.

Consider one more copy of the space V and denote it by iV to distinguish it from
the original. Vectors in the space iV are denoted by iv. Thus, we have two
exemplars of every vector $v \in V$: the original v and its copy $iv \in iV$. Take the direct
sum of these spaces and denote it by

$$V_{\mathbb{C}} = V \oplus iV. \tag{18.6}$$

This is a vector space over the field $\mathbb{R}$ of dimension $2\dim_{\mathbb{R}} V$. By construction, it
consists of vectors $w = v_1 + iv_2$, and the equality $v_1 + iv_2 = w_1 + iw_2$ in $V_{\mathbb{C}}$ is
equivalent to the pair of equalities $v_1 = w_1$, $v_2 = w_2$ in V. Define multiplication by
a complex number $z = x + iy \in \mathbb{C}$ in $V_{\mathbb{C}}$ by

$$(x + iy)\cdot(v_1 + iv_2) \stackrel{\text{def}}{=} (xv_1 - yv_2) + i(yv_1 + xv_2) \in V \oplus iV. \tag{18.7}$$


Exercise 18.1 Verify that the multiplication given in (18.7) provides Vc with the 
structure of a vector space over the field C. 
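
Part of the verification asked for here can be explored numerically. The Python sketch below is an illustration under stated assumptions: a vector v_1 + iv_2 of the complexification is modelled as a pair of coordinate lists, the function name `scalar_mul` is hypothetical, and only the axiom (z_1 z_2)·w = z_1·(z_2·w) is tested on one sample.

```python
def scalar_mul(z, w):
    """Multiply w = (v1, v2), standing for v1 + i*v2, by z = x + i*y
    following rule (18.7)."""
    x, y = z.real, z.imag
    v1, v2 = w
    new_v1 = [x * a - y * b for a, b in zip(v1, v2)]
    new_v2 = [y * a + x * b for a, b in zip(v1, v2)]
    return (new_v1, new_v2)


w = ([1.0, 0.0, 2.0], [0.0, -1.0, 4.0])   # v1 + i*v2 with v1, v2 in R^3
z1, z2 = 2 + 1j, -1 + 3j

left = scalar_mul(z1 * z2, w)             # (z1*z2) . w
right = scalar_mul(z1, scalar_mul(z2, w)) # z1 . (z2 . w)
```

The two results coincide, and multiplying twice by i gives the same vector as multiplying by −1; a single sample is of course not a proof.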


Note that every basis $e_1, e_2, \dots, e_n$ of V over $\mathbb{R}$ remains a basis of $V_{\mathbb{C}}$ over $\mathbb{C}$ as
well, because the existence and uniqueness of expressions of vectors $v_1 \in V$ and
$iv_2 \in iV$ respectively in terms of the basis vectors $e_\nu \in V$ and $ie_\nu \in iV$ with real
coefficients $x_\nu, y_\nu \in \mathbb{R}$,

$$\begin{aligned}
v_1 &= x_1e_1 + x_2e_2 + \dots + x_ne_n,\\
v_2 &= y_1e_1 + y_2e_2 + \dots + y_ne_n,
\end{aligned}$$

means exactly the same as the existence and uniqueness of the expansion of a vector
$w = v_1 + iv_2$ in $V_{\mathbb{C}}$ in terms of the vectors $e_\nu$ with complex coefficients
$z_\nu = x_\nu + iy_\nu \in \mathbb{C}$:

$$w = z_1e_1 + z_2e_2 + \dots + z_ne_n.$$

Therefore, $\dim_{\mathbb{C}} V_{\mathbb{C}} = \dim_{\mathbb{R}} V$.
18.2.2 Complex Conjugation


A complexified real vector space $V_{\mathbb{C}}$ is equipped with the $\mathbb{R}$-linear involution

$$\sigma : V_{\mathbb{C}} \to V_{\mathbb{C}}, \quad w = v_1 + iv_2 \mapsto \overline{w} = v_1 - iv_2,$$

called complex conjugation and behaving much like complex conjugation of
complex numbers within $\mathbb{C}$. Clearly, $\sigma^2 = \mathrm{Id}_{V_{\mathbb{C}}}$, and the $\mathbb{R}$-subspaces V, iV are
the ±1-eigenspaces of $\sigma$. The first, which consists of the +1-eigenvectors of $\sigma$, is
called real, whereas the second, consisting of the −1-eigenvectors of $\sigma$, is called
pure imaginary. Thus, the decomposition $V_{\mathbb{C}} = V \oplus iV$ can be thought of as a
diagonalization of $\sigma$. Let me stress that $\sigma$ is not $\mathbb{C}$-linear: $\sigma(zw) = \overline{z}\,\sigma(w)$ for all
$z \in \mathbb{C}$ and all $w \in V_{\mathbb{C}}$. Every $\mathbb{R}$-linear map between complex vector spaces satisfying
this property is called semilinear or $\mathbb{C}$-antilinear.
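
The antilinearity of complex conjugation can be seen on a small example. The following Python sketch is illustrative only: vectors v_1 + iv_2 are again modelled as pairs of coordinate lists, and the names `scalar_mul` and `conj` are hypothetical helpers rather than notation from the text.

```python
def scalar_mul(z, w):
    """Multiplication by z = x + i*y on pairs (v1, v2), as in rule (18.7)."""
    x, y = z.real, z.imag
    v1, v2 = w
    return ([x * a - y * b for a, b in zip(v1, v2)],
            [y * a + x * b for a, b in zip(v1, v2)])


def conj(w):
    """Complex conjugation: v1 + i*v2 goes to v1 - i*v2."""
    v1, v2 = w
    return (list(v1), [-b for b in v2])


w = ([1.0, 2.0], [3.0, -1.0])
z = 2 - 5j

antilinear = conj(scalar_mul(z, w)) == scalar_mul(z.conjugate(), conj(w))
linear = conj(scalar_mul(z, w)) == scalar_mul(z, conj(w))
involution = conj(conj(w)) == w
```

For this sample, `antilinear` and `involution` hold while `linear` fails, in agreement with the relation between conjugation and multiplication by complex numbers stated above.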


18.2.3 Complexification of Linear Maps 


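Every $\mathbb{R}$-linear map $F : V' \to V''$ between vector spaces over $\mathbb{R}$ can be extended to
a $\mathbb{C}$-linear map between their complexifications:

$$F_{\mathbb{C}} : V'_{\mathbb{C}} \to V''_{\mathbb{C}}, \quad v_1 + iv_2 \mapsto F(v_1) + iF(v_2). \tag{18.8}$$

The operator $F_{\mathbb{C}}$ is called the complexification of the $\mathbb{R}$-linear operator F.


Exercise 18.2 Verify that $F_{\mathbb{C}}$ is $\mathbb{C}$-linear, i.e., $F_{\mathbb{C}}(zw) = zF_{\mathbb{C}}(w)$ for all $z \in \mathbb{C}$ and
all $w \in V'_{\mathbb{C}}$.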


It follows from (18.8) that a complexified operator commutes with complex
conjugation of vectors:

$$\overline{F_{\mathbb{C}}(w)} = F_{\mathbb{C}}(\overline{w}) \quad \forall\, w \in V'_{\mathbb{C}}. \tag{18.9}$$

In every basis of $V_{\mathbb{C}}$ formed by real vectors $e_1, e_2, \dots, e_n \in V$, the matrices of
F and $F_{\mathbb{C}}$ coincide and are real. In particular, F and $F_{\mathbb{C}}$ have equal characteristic
polynomials with real coefficients. The collections of elementary divisors $\mathcal{E}\ell(F)$ and
$\mathcal{E}\ell(F_{\mathbb{C}})$ are related by the following proposition.


Proposition 18.2 Every elementary divisor $(t - \lambda)^m \in \mathcal{E}\ell(F)$, where $\lambda \in \mathbb{R}$, is
simultaneously an elementary divisor for $F_{\mathbb{C}}$. Every elementary divisor $p^m(t) \in
\mathcal{E}\ell(F)$ with monic quadratic trinomial

$$p(t) = t^2 - 2t\operatorname{Re}\lambda + |\lambda|^2 = (t - \lambda)(t - \overline{\lambda}) \in \mathbb{R}[t]$$

irreducible over $\mathbb{R}$ splits into a pair of elementary divisors $(t - \lambda)^m, (t - \overline{\lambda})^m \in
\mathcal{E}\ell(F_{\mathbb{C}})$.
Proof The complexification of the real vector space $\mathbb{R}[t]/(p^m)$ is the complex vector
space $\mathbb{C}[t]/(p^m)$. The multiplication-by-t operator in the first space is complexified
to the multiplication-by-t operator in the second. If $p(t) = t - \lambda$ for some $\lambda \in \mathbb{R}$,
then $\mathbb{C}[t]/(p^m)$ remains an indecomposable $\mathbb{C}[t]$-module of p-torsion. If

$$p(t) = (t - \lambda)(t - \overline{\lambda})$$

for nonreal $\lambda \in \mathbb{C}$, then the $\mathbb{C}[t]$-module

$$\mathbb{C}[t]/(p^m) = \mathbb{C}[t]/\bigl((t - \lambda)^m\bigr) \oplus \mathbb{C}[t]/\bigl((t - \overline{\lambda})^m\bigr)$$

splits into the direct sum of two indecomposable submodules. □


18.2.4 Complex Eigenvectors 


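Since the field $\mathbb{C}$ is algebraically closed, for every $\mathbb{R}$-linear operator $F : V \to V$,
the complexified operator $F_{\mathbb{C}} : V_{\mathbb{C}} \to V_{\mathbb{C}}$ has nonempty spectrum. For a nonreal
eigenvalue $\lambda = a + ib = \varrho\cdot(\cos\varphi + i\sin\varphi) \in \operatorname{Spec} F_{\mathbb{C}} \smallsetminus \mathbb{R}$ and nonzero eigenvector
$w = v_1 + iv_2 \in V_{\mathbb{C}}$ such that $F_{\mathbb{C}}(w) = \lambda w$, we have $F(v_1) + iF(v_2) =
F_{\mathbb{C}}(v_1 + iv_2) = (a + ib)(v_1 + iv_2) = (av_1 - bv_2) + i(bv_1 + av_2)$. This means that
the $\mathbb{R}$-linear subspace $U \subset V$ spanned by $v_1, v_2$ is F-invariant and the matrix of $F|_U$
in the generators $v_1, v_2$ is equal to

$$\begin{pmatrix} a & b\\ -b & a \end{pmatrix} = \varrho\cdot\begin{pmatrix} \cos\varphi & \sin\varphi\\ -\sin\varphi & \cos\varphi \end{pmatrix}. \tag{18.10}$$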

Note that we have obtained another proof of Proposition 15.1 on p. 366, which states
that every $\mathbb{R}$-linear operator possesses an invariant subspace of dimension 1 or 2.

Since the matrix (18.10) has determinant $|\lambda|^2 \ne 0$, the vectors $v_1, v_2$ are linearly
independent and form a basis of U over $\mathbb{R}$ and of $\mathbb{C} \otimes U$ over $\mathbb{C}$. Another basis of
$\mathbb{C} \otimes U$ over $\mathbb{C}$ is formed by the nonreal complex conjugate eigenvectors $w = v_1 + iv_2$
and $\overline{w} = v_1 - iv_2$ with conjugate eigenvalues $\lambda = a + ib$ and $\overline{\lambda} = a - ib$.
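
A concrete 2 × 2 case shows how the real and imaginary parts of a complex eigenvector span an invariant plane. The Python sketch below is an illustration under stated assumptions: the sample matrix `F`, its eigenvalue 2 + i, and the eigenvector (2, −1 − i) were chosen by hand, and the helper name `apply` is hypothetical.

```python
def apply(M, v):
    """Matrix-vector product; works for real and complex coordinates alike."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


F = [[1.0, -2.0], [1.0, 3.0]]       # real operator with characteristic polynomial t^2 - 4t + 5
lam = 2 + 1j                        # nonreal eigenvalue a + i*b
w = [2 + 0j, -1 - 1j]               # eigenvector w = v1 + i*v2 of the complexified operator

a, b = lam.real, lam.imag
v1 = [c.real for c in w]            # real part of w
v2 = [c.imag for c in w]            # imaginary part of w

Fw = apply(F, w)                                        # equals lam * w
Fv1_expected = [a * p - b * q for p, q in zip(v1, v2)]  # a*v1 - b*v2
Fv2_expected = [b * p + a * q for p, q in zip(v1, v2)]  # b*v1 + a*v2
```

Comparing `apply(F, v1)` and `apply(F, v2)` with the two expected lists reproduces the columns (a, −b) and (b, a) of the matrix (18.10).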


Exercise 18.3 For every $\mathbb{R}$-linear operator $F : V \to V$, check that $w = v_1 + iv_2 \in
V_{\mathbb{C}}$ is an eigenvector of $F_{\mathbb{C}}$ with eigenvalue $\lambda$ if and only if $\overline{w} = v_1 - iv_2$ is an
eigenvector of $F_{\mathbb{C}}$ with eigenvalue $\overline{\lambda}$.


For $v_1 \ne 0$, the $\mathbb{R}$-linear span of the vectors $w$, $\overline{w}$ intersects the $\mathbb{R}$-linear span of
the real vectors $v_1, v_2$ along the 1-dimensional (over $\mathbb{R}$) subspace generated by $v_1$.
For $v_1 = 0$, these two real planes are transversal. The matrix of $F_{\mathbb{C}}$ in the basis $w$,
$\overline{w}$ is diagonal with conjugate eigenvalues $\lambda$, $\overline{\lambda}$ on the main diagonal.

If the eigenvalue $\lambda \in \operatorname{Spec} F_{\mathbb{C}}$ is real, then $\lambda \in \operatorname{Spec} F$ as well, and the
$\lambda$-eigenspace of $F_{\mathbb{C}}$ coincides with the complexification of the $\lambda$-eigenspace of F in V:

$$W_\lambda = \{w \in V_{\mathbb{C}} \mid F_{\mathbb{C}}(w) = \lambda w\} = \mathbb{C} \otimes \{v \in V \mid F(v) = \lambda v\} = \mathbb{C} \otimes V_\lambda,$$

because the real and imaginary parts of every complex eigenvector $w = v_1 + iv_2 \in
W_\lambda$ lie in $V_\lambda$, and conversely, every $\mathbb{C}$-linear combination of real $\lambda$-eigenvectors is a
$\lambda$-eigenvector as well.


18.2.5 Complexification of the Dual Space 


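For every real vector space V, there are three complex vector spaces naturally
associated with the space $V^* = \operatorname{Hom}_{\mathbb{R}}(V, \mathbb{R})$ dual to V over $\mathbb{R}$. The first is its
complexification $\mathbb{C} \otimes V^*$. The second is $\operatorname{Hom}_{\mathbb{C}}(V_{\mathbb{C}}, \mathbb{C})$, the space dual over $\mathbb{C}$ to
the complexified space $V_{\mathbb{C}} = \mathbb{C} \otimes V$. The third is $\operatorname{Hom}_{\mathbb{R}}(V, \mathbb{C})$, the space of
$\mathbb{R}$-linear maps $V \to \mathbb{C}$, where multiplication by complex numbers is given by the rule
$\lambda\cdot\varphi : v \mapsto \lambda\cdot\varphi(v)$. In fact, these three complex vector spaces are canonically
isomorphic to each other. Indeed, each $\mathbb{R}$-linear map $\varphi : V \to \mathbb{C}$ can be written as
$v \mapsto \varphi_1(v) + i\varphi_2(v)$, where $\varphi_1 = \operatorname{Re}\varphi : V \to \mathbb{R}$ and $\varphi_2 = \operatorname{Im}\varphi : V \to \mathbb{R}$ are both
$\mathbb{R}$-linear. The complexified dual space $\mathbb{C} \otimes V^*$ consists of elements $\varphi_1 + i\varphi_2$,
$\varphi_1, \varphi_2 \in V^*$, which are multiplied by complex numbers in the same way as the
previous maps $V \to \mathbb{C}$. Every $\mathbb{C}$-linear form $\psi : V_{\mathbb{C}} \to \mathbb{C}$ lying in the complex
space dual to $V_{\mathbb{C}}$ satisfies $\psi(v_1 + iv_2) = \psi(v_1) + i\psi(v_2)$, because of the $\mathbb{C}$-linearity.
Thus, $\psi$ is uniquely determined by its restriction $\varphi = \psi|_V : V \to \mathbb{C}$ to the real
subspace. The latter is an arbitrary $\mathbb{R}$-linear form.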


18.2.6 Complexification of a Bilinear Form 


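Every $\mathbb{R}$-bilinear form $\beta : V \times V \to \mathbb{R}$ on a real vector space V can be extended to
a $\mathbb{C}$-bilinear form $\beta_{\mathbb{C}} : V_{\mathbb{C}} \times V_{\mathbb{C}} \to \mathbb{C}$ on the complexified space $V_{\mathbb{C}}$ by

$$\beta_{\mathbb{C}}(u_1 + iu_2, v_1 + iv_2) = \bigl(\beta(u_1, v_1) - \beta(u_2, v_2)\bigr) + i\bigl(\beta(u_1, v_2) + \beta(u_2, v_1)\bigr).$$

The form $\beta_{\mathbb{C}}$ is called the complexification of the bilinear form $\beta$.


Exercise 18.4 Convince yourself that the identification $V_{\mathbb{C}}^* \simeq \mathbb{C} \otimes V^*$ described
above takes both the left and right correlation maps $V_{\mathbb{C}} \to V_{\mathbb{C}}^*$ provided by
the complexified bilinear form $\beta_{\mathbb{C}}$ to the complexifications of the left and right
correlation maps $V \to V^*$ provided by $\beta$.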


The Gram matrix of $\beta_{\mathbb{C}}$ in a real basis of $V_{\mathbb{C}}$ over $\mathbb{C}$ coincides with the Gramian of $\beta$
in the same basis of V over $\mathbb{R}$. If $\beta$ is (skew) symmetric, then $\beta_{\mathbb{C}}$ is (skew) symmetric
as well. Note that a specific real invariant of a symmetric form, the signature,
completely disappears after complexification. For example, all nondegenerate symmetric
forms become equivalent over $\mathbb{C}$ and produce the same smooth complex projective
quadric $Q \subset \mathbb{P}(V_{\mathbb{C}})$ of planarity $[\dim_{\mathbb{R}} V/2]$. The real projective quadric given by
the same form in $\mathbb{P}(V)$ can be recovered as the fixed-point set of the action of the
complex conjugation map on Q.
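
The disappearance of the signature can be watched on the smallest example. The Python sketch below is an illustration, with hypothetical helper names `beta` and `beta_c` and a sample Gram matrix of signature (1, 1): after passing to the complex basis e_1, ie_2, the Gram matrix of the complexified form becomes the identity.

```python
def beta(gram, u, v):
    """Real bilinear form with the given Gram matrix."""
    n = len(gram)
    return sum(u[r] * gram[r][c] * v[c] for r in range(n) for c in range(n))


def beta_c(gram, u, v):
    """C-bilinear extension: same Gram matrix, complex coordinates,
    no conjugation anywhere."""
    n = len(gram)
    return sum(u[r] * gram[r][c] * v[c] for r in range(n) for c in range(n))


gram = [[1, 0], [0, -1]]            # the form x1*y1 - x2*y2 of signature (1, 1)
basis = [[1, 0], [0, 1j]]           # complex basis e_1, i*e_2
new_gram = [[beta_c(gram, p, q) for q in basis] for p in basis]

# Check the defining formula on one pair of vectors u1 + i*u2, v1 + i*v2.
u1, u2, v1, v2 = [1, 2], [0, 3], [4, -1], [2, 5]
u = [complex(p, q) for p, q in zip(u1, u2)]
v = [complex(p, q) for p, q in zip(v1, v2)]
by_formula = complex(beta(gram, u1, v1) - beta(gram, u2, v2),
                     beta(gram, u1, v2) + beta(gram, u2, v1))
```

Here `new_gram` is the identity matrix, so over the complex numbers this form is equivalent to the sum of squares, and `by_formula` agrees with `beta_c(gram, u, v)`.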
18.3 Real Structures


Let W be an arbitrary vector space over $\mathbb{C}$. Every $\mathbb{R}$-linear $\mathbb{C}$-antilinear operator
$\sigma : W_{\mathbb{R}} \to W_{\mathbb{R}}$ such that $\sigma^2 = \mathrm{Id}_W$ is called a real structure³ on the complex vector
space W. Every vector space W over $\mathbb{C}$ equipped with a real structure $\sigma$ splits into a
direct sum of ±1-eigenspaces: $W_{\mathbb{R}} = V_+ \oplus V_-$, where both

$$V_+ = \ker(\sigma - \mathrm{Id}) = \operatorname{im}(\sigma + \mathrm{Id}) \quad\text{and}\quad V_- = \ker(\sigma + \mathrm{Id}) = \operatorname{im}(\sigma - \mathrm{Id})$$

are vector spaces over $\mathbb{R}$ only. Since $\sigma$ is $\mathbb{C}$-antilinear, multiplication by i and −i
within W assigns $\mathbb{R}$-linear isomorphisms

$$i : V_+ \to V_-,\ v_+ \mapsto iv_+, \qquad -i : V_- \to V_+,\ v_- \mapsto -iv_-,$$

inverse to each other. Indeed, $v_+ \in V_+ \Rightarrow \sigma(v_+) = v_+ \Rightarrow \sigma(iv_+) =
-i\sigma(v_+) = -iv_+ \Rightarrow iv_+ \in V_-$, and similarly $v_- \in V_- \Rightarrow \sigma(v_-) =
-v_- \Rightarrow \sigma(-iv_-) = i\sigma(v_-) = -iv_- \Rightarrow -iv_- \in V_+$. Therefore, every complex
vector space equipped with a real structure $\sigma$ is canonically identified with the
complexification of the +1-eigensubspace $V_+$ of $\sigma$, i.e., $W = V_+ \oplus V_- = V_+ \oplus iV_+$,
and multiplication by complex numbers within W proceeds exactly as given in
formula (18.7) on p. 462.

Let me stress that a real structure is a supplementary quality of a complex vector
space. It is not supplied just by the definition. An arbitrary complex vector space, say
one-dimensional, spanned by some abstract vector e, provides no way to say what
constitutes the real and pure imaginary parts of e. For example, you can declare
them to be $\omega\cdot e$ and $i\omega\cdot e$, where $\omega = (-1 + i\sqrt{3})/2$. If a complex vector space
W is equipped with a real structure $\sigma$, this means precisely that $W = \mathbb{C} \otimes V$ is the
complexification of the real vector space $V = \ker(\sigma - \mathrm{Id})$. An abstract complex
vector space can be equipped with many different real structures leading to
different subspaces of real vectors, and there is no a priori preferred one among
them.


Example 18.2 (Hermitian Adjoint of Matrices) A real structure on the space of
square complex matrices $W = \operatorname{Mat}_n(\mathbb{C})$ is given by Hermitian conjugation

$$\sigma_\dagger : A \mapsto A^\dagger \stackrel{\text{def}}{=} \overline{A}^{\,t},$$

which takes a matrix to its conjugate transpose.


3 Or complex conjugation.


Exercise 18.5 Check that $\sigma_\dagger$ is an $\mathbb{R}$-linear $\mathbb{C}$-antilinear involution.


The real subspace of Hermitian adjunction consists of Hermitian matrices

$$\operatorname{Mat}_n^{\mathrm{h}} \stackrel{\text{def}}{=} \{A \in \operatorname{Mat}_n(\mathbb{C}) \mid A = A^\dagger\}.$$

This is an $n^2$-dimensional vector space over the field $\mathbb{R}$. Pure imaginary matrices
with respect to Hermitian adjunction are called skew-Hermitian.⁴ They also form an
$n^2$-dimensional vector space

$$\operatorname{Mat}_n^{\mathrm{sh}} \stackrel{\text{def}}{=} \{A \in \operatorname{Mat}_n(\mathbb{C}) \mid A = -A^\dagger\}$$

over $\mathbb{R}$. For example, the Hermitian and skew-Hermitian 2 × 2 matrices look like
this:

$$\begin{pmatrix} a & b + ic\\ b - ic & d \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} ia & b + ic\\ -b + ic & id \end{pmatrix},$$

where $a, b, c, d \in \mathbb{R}$. Note that the main diagonal is real for Hermitian matrices and
is pure imaginary for skew-Hermitian ones. Multiplication by i takes a Hermitian
matrix to a skew-Hermitian matrix and conversely. An arbitrary matrix is a sum of
Hermitian and skew-Hermitian parts:

$$A = A_{\mathrm{h}} + A_{\mathrm{sh}}, \quad\text{where } A_{\mathrm{h}} = (A + A^\dagger)/2, \quad A_{\mathrm{sh}} = (A - A^\dagger)/2.$$
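
The decomposition into Hermitian and skew-Hermitian parts is easy to try out. The Python sketch below is an illustration only; the helper names `dagger` and `hermitian_parts` and the sample matrix are assumptions made for this example.

```python
def dagger(A):
    """Conjugate transpose of a square complex matrix given as a list of rows."""
    n = len(A)
    return [[A[c][r].conjugate() for c in range(n)] for r in range(n)]


def hermitian_parts(A):
    """Return (A_h, A_sh) with A_h = (A + A^dagger)/2 and A_sh = (A - A^dagger)/2."""
    Ad = dagger(A)
    n = len(A)
    Ah = [[(A[r][c] + Ad[r][c]) / 2 for c in range(n)] for r in range(n)]
    Ash = [[(A[r][c] - Ad[r][c]) / 2 for c in range(n)] for r in range(n)]
    return Ah, Ash


A = [[1 + 2j, 3 - 1j], [4j, -2 + 0j]]
Ah, Ash = hermitian_parts(A)

# Multiplication by i turns the Hermitian part into a skew-Hermitian matrix.
i_Ah = [[1j * entry for entry in row] for row in Ah]
```

For this sample the diagonal of `Ah` is real and that of `Ash` is pure imaginary, as noted above.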


18.4 Complex Structures 


If a vector space V over the field $\mathbb{R}$ comes as the realification of a vector space W
over $\mathbb{C}$, then $V = W_{\mathbb{R}}$ inherits a distinguished $\mathbb{R}$-linear automorphism $I : V \to V$,
$v \mapsto iv$, provided by multiplication by $i \in \mathbb{C}$ within W. Clearly, $I^2 = -\mathrm{Id}_V$. Given
an arbitrary vector space V over $\mathbb{R}$, every $\mathbb{R}$-linear operator $I : V \to V$ such
that $I^2 = -\mathrm{Id}_V$ is called a complex structure on V, because it allows one to define
multiplication of vectors $v \in V$ by complex numbers as follows:

$$(x + iy)\cdot v \stackrel{\text{def}}{=} x\cdot v + y\cdot Iv \quad \forall\, x + iy \in \mathbb{C},\ \forall\, v \in V. \tag{18.11}$$


Exercise 18.6 Verify that (18.11) provides V with the structure of a vector space 
over the field C. 
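
For a concrete feeling of formula (18.11), one can take the plane with the rotation by a right angle as the operator I. The Python sketch below is an illustration under these assumptions; the names `apply` and `scalar_mul` are hypothetical, and only one of the vector space axioms is tested on one sample.

```python
I = [[0.0, -1.0], [1.0, 0.0]]       # rotation by 90 degrees, so I*I = -Id


def apply(M, v):
    """Matrix-vector product for real 2 x 2 matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def scalar_mul(z, v):
    """(x + i*y) . v = x*v + y*I(v), as in formula (18.11)."""
    Iv = apply(I, v)
    return [z.real * a + z.imag * b for a, b in zip(v, Iv)]


v = [1.0, 2.0]
z1, z2 = 2 + 1j, -1 + 3j

I_squared_v = apply(I, apply(I, v))         # should equal -v
left = scalar_mul(z1 * z2, v)               # (z1*z2) . v
right = scalar_mul(z1, scalar_mul(z2, v))   # z1 . (z2 . v)
```

With this choice of I, the rule (18.11) is ordinary multiplication of complex numbers written in the coordinates (Re, Im): the vector (1, 2) stands for 1 + 2i, and (−5 + 5i)(1 + 2i) = −15 − 5i.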


We write $V_I$ for V considered as a vector space over $\mathbb{C}$ with multiplication of
vectors by complex numbers defined by the formula (18.11). Note that the initial
real vector space V is tautologically the realification of $V_I$. In particular, this forces
dim V to be even.


4 Or anti-Hermitian.


Now let us make the verification asked in Exercise 18.6 visible without
computations. Since the operator I is annihilated by the polynomial

$$t^2 + 1 = (t + i)(t - i),$$

the complexified operator $I_{\mathbb{C}} : V_{\mathbb{C}} \to V_{\mathbb{C}}$ is diagonalizable⁵ and has eigenvalues
among ±i. Therefore, $V_{\mathbb{C}}$ splits into a direct sum of ±i-eigenspaces

$$U_\pm = \ker(I_{\mathbb{C}} \mp i\,\mathrm{Id}_{V_{\mathbb{C}}})$$

if both are nonzero. By formula (18.9) on p. 463, the relations $I_{\mathbb{C}}w = iw$ and
$I_{\mathbb{C}}\overline{w} = -i\overline{w}$ are conjugate to each other and therefore equivalent. In other words,
complex conjugation of vectors establishes an $\mathbb{R}$-linear $\mathbb{C}$-antilinear isomorphism
$U_+ \xrightarrow{\sim} U_-$ between the complex ±i-eigenspaces of $I_{\mathbb{C}}$, that is, $U_- = \overline{U_+}$. In
particular, both eigenspaces are nonzero and have the same dimension $\dim_{\mathbb{C}} U_+ =
\dim_{\mathbb{C}} U_- = n$. Therefore, $V_{\mathbb{C}} = U_+ \oplus \overline{U_+}$ as a vector space over $\mathbb{C}$ and
$\dim_{\mathbb{R}} V = \dim_{\mathbb{C}} V_{\mathbb{C}} = 2\dim_{\mathbb{C}} U_+ = 2n$. Taking the real part of a vector establishes
an $\mathbb{R}$-linear isomorphism between the complex vector space $U_+$ and the initial real
vector space V:

$$\operatorname{Re} : U_+ \xrightarrow{\sim} V, \quad w \mapsto \operatorname{Re}(w) = \frac{w + \overline{w}}{2}, \tag{18.12}$$

because $U_+ \cap \ker\operatorname{Re} \subset U_+ \cap U_- = 0$ and $\dim_{\mathbb{R}} V = \dim_{\mathbb{R}} U_+$. The
isomorphism (18.12) allows us to transfer from $U_+$ to V the multiplication by
complex numbers initially residing in $U_+$. We put $\lambda\cdot\operatorname{Re}(u) \stackrel{\text{def}}{=} \operatorname{Re}(\lambda u)$ for all
$v = \operatorname{Re}(u) \in V$. This certainly makes a vector space over $\mathbb{C}$ from V. Since $iu = I_{\mathbb{C}}u$
for all $u \in U_+$ and $\operatorname{Re} \circ I_{\mathbb{C}} = I \circ \operatorname{Re}$, we get for every $v = \operatorname{Re}(u) \in V$ and $x + iy \in \mathbb{C}$
the formula

$$\begin{aligned}
(x + iy)\cdot v &= (x + iy)\cdot\operatorname{Re}(u) = \operatorname{Re}(xu + yiu) = x\operatorname{Re}(u) + y\operatorname{Re}(iu)\\
&= x\operatorname{Re}(u) + y\operatorname{Re}(I_{\mathbb{C}}u) = x\operatorname{Re}(u) + yI(\operatorname{Re}u) = xv + yI(v),
\end{aligned}$$

which agrees with (18.11). Let us summarize all these observations in a single claim.


Proposition 18.3 For every vector space V over R, the following data are in 
canonical bijection: 


(1) multiplication of vectors by complex numbers $\mathbb{C} \times V \to V$ that makes a vector
space over $\mathbb{C}$ from $V$ and forces $V$ to be the realification of this complex vector
space,

(2) the $\mathbb{R}$-linear operator $I : V \to V$ such that $I^2 = -\mathrm{Id}_V$,

(3) the complex vector subspace $U \subset V_{\mathbb{C}}$ such that $V_{\mathbb{C}} = U \oplus \overline{U}$.


The correspondence (1) → (2) sends multiplication $\mathbb{C} \times V \to V$ to the
multiplication-by-$i$ operator $I : v \mapsto iv$. The correspondence (2) → (3) sends
the operator $I$ to the $(+i)$-eigenspace $U = U_+ \subset V_{\mathbb{C}}$ of the complexified operator
$I_{\mathbb{C}} : V_{\mathbb{C}} \to V_{\mathbb{C}}$. The correspondence (3) → (1) transfers the complex vector space
structure from $U$ to $V$ by means of the $\mathbb{R}$-linear isomorphism $\mathrm{Re} : U \xrightarrow{\sim} V$,
$w \mapsto (w + \overline{w})/2$. □

⁵ See Proposition 15.6 on p. 373.
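
As a small numerical illustration of the passage (2) → (1), the following Python sketch takes $V = \mathbb{R}^2$ with $I$ the rotation by 90° (an example chosen here for illustration, not taken from the text) and checks that the rule $(x + iy)\cdot v = xv + yI(v)$ is compatible with the multiplication of complex numbers. The helper names `I` and `cmul` are hypothetical.

```python
# Complex structure on V = R^2: I rotates by 90 degrees, so I(I(v)) = -v.
def I(v):
    x, y = v
    return (-y, x)

def cmul(z, v):
    """Multiply the real vector v by the complex number z via z.v = Re(z) v + Im(z) I(v)."""
    a, b = z.real, z.imag
    Iv = I(v)
    return (a * v[0] + b * Iv[0], a * v[1] + b * Iv[1])

v = (2.0, -1.0)
z1, z2 = 1 + 2j, 3 - 1j

# I squares to minus the identity.
minus_v = I(I(v))

# Compatibility with complex multiplication: (z1 z2).v == z1.(z2.v).
lhs = cmul(z1 * z2, v)
rhs = cmul(z1, cmul(z2, v))
print(minus_v, lhs, rhs)
```

Only one of the vector space axioms is tested here; the remaining ones (distributivity, action of 1) follow from the $\mathbb{R}$-linearity of $I$.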


18.5 Hermitian Enhancement of Euclidean Structure 


18.5.1 Hermitian Structure 


For purposes of metric geometry, which concerns lengths, angles, and areas rather
than incidence relations between figures and equations, in the world of complex
vector spaces more important than the symmetric inner products are the Hermitian
symmetric⁶ inner products $(\ast, \ast) : W \times W \to \mathbb{C}$, which satisfy the relation

$$(u, w) = \overline{(w, u)}.$$

Hermitian symmetry forces the inner product of a vector with itself to be real:
$(w, w) = \overline{(w, w)} \in \mathbb{R}$. Under the assumption that the inner product of every nonzero
vector with itself is positive, the presence of the Hermitian inner product allows 
one to develop metric geometry in a complex vector space quite similarly to what 
we did in Chap. 10 for real Euclidean vector spaces. We will go into the Hermitian 
geometry and metric invariants of linear maps in the next chapter. The remainder of 
the current chapter will be devoted to algebraic properties inherent in the Hermitian 
inner product itself. 

If a conjugate symmetric product $(u, w)$ is $\mathbb{C}$-linear in the first argument $u$, then
it has to be $\mathbb{C}$-antilinear in the second argument $w$, because $(u, zw) = \overline{(zw, u)} =
\overline{z(w, u)} = \overline{z}\,\overline{(w, u)} = \overline{z}\,(u, w)$. Every $\mathbb{R}$-bilinear form $W \times W \to \mathbb{C}$ that is $\mathbb{C}$-linear in
the first argument and $\mathbb{C}$-antilinear in the second is called $\mathbb{C}$-sesquilinear.


Definition 18.1 (Hermitian Structure) A complex vector space $W$ equipped with
a Hermitian symmetric positive $\mathbb{R}$-bilinear $\mathbb{C}$-sesquilinear form

$$(\ast, \ast) : W \times W \to \mathbb{C}$$

is called Hermitian. These conditions mean that for all $u, w \in W$ and all $z \in \mathbb{C}$,

$$\begin{aligned}
&(u, w) = \overline{(w, u)} && \text{(Hermitian symmetry)},\\
&(zu, w) = z(u, w) = (u, \overline{z} w) && \text{(sesquilinearity)},\\
&(w, w) > 0 \ \text{ for } w \neq 0 && \text{(positivity)}.
\end{aligned} \tag{18.13}$$

Every inner product with such properties is called a Hermitian structure on $W$.

⁶ Or conjugate symmetric.


Example 18.3 (Coordinate Space) The standard Hermitian structure on the coordinate
vector space $\mathbb{C}^n$ is given by

$$(u, w) = u_1 \overline{w}_1 + u_2 \overline{w}_2 + \cdots + u_n \overline{w}_n \tag{18.14}$$

for $u = (u_1, u_2, \ldots, u_n)$, $w = (w_1, w_2, \ldots, w_n)$. Note that $(w, w) = \sum |w_i|^2 > 0$ for
$w \neq 0$.
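
The three conditions (18.13) can be checked numerically for the form (18.14). The sketch below is only an illustration on one sample pair of vectors in $\mathbb{C}^2$; the function name `herm` and the sample data are assumptions made here.

```python
def herm(u, w):
    """Standard Hermitian product on C^n: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

u = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]
z = 0.5 - 2j

# Hermitian symmetry: (u, w) is the conjugate of (w, u).
symmetric = herm(u, w) == herm(w, u).conjugate()

# Sesquilinearity: (zu, w) = z (u, w) = (u, conj(z) w).
linear_first = abs(herm([z * a for a in u], w) - z * herm(u, w)) < 1e-12
antilinear_second = abs(herm(u, [z.conjugate() * b for b in w]) - z * herm(u, w)) < 1e-12

# Positivity: (w, w) is a positive real number for this nonzero w.
positive = herm(w, w).real > 0 and herm(w, w).imag == 0

print(herm(u, w), symmetric, linear_first, antilinear_second, positive)
```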


Example 18.4 (Space of Integrable Functions) The infinite-dimensional analogue
of (18.14) is the standard Hermitian structure on the space of continuous functions
$[a, b] \to \mathbb{C}$ defined by

$$(f, g) = \int_a^b f(x)\,\overline{g(x)}\,dx, \tag{18.15}$$

where the integral of a complex-valued function $h(x) = u(x) + iv(x)$ with $u, v : [a, b] \to \mathbb{R}$
is defined as

$$\int_a^b h(x)\,dx = \int_a^b \bigl(u(x) + iv(x)\bigr)\,dx := \int_a^b u(x)\,dx + i \int_a^b v(x)\,dx.$$


Of course, the domain of integration and the notion of integral can be varied here 
if the positivity condition from (18.13) holds for the class of integrable functions 
being considered. 


Example 18.5 (Hermitian Complexification of Euclidean Space) Let $V$ be a real
Euclidean vector space with inner product $(\ast, \ast) : V \times V \to \mathbb{R}$. Then the standard
Hermitian extension of this product to the complexified space $W = V_{\mathbb{C}}$ is given by
the following prescription forced by sesquilinearity:

$$(u_1 + iv_1, u_2 + iv_2) = \bigl((u_1, u_2) + (v_1, v_2)\bigr) + i\bigl((v_1, u_2) - (u_1, v_2)\bigr). \tag{18.16}$$


Note that the real and imaginary parts on the right-hand side are respectively the 
symmetric and skew-symmetric R-bilinear forms. Also note that the Hermitian 
forms in Example 18.3 and Example 18.4 are exactly the standard Hermitian 
extensions of the standard Euclidean structures considered in Example 10.1 and 
Example 10.2 on p. 230. 
18.5.2 Kähler Triples


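The Hermitian structure $(\ast, \ast)$ on a complex vector space $W$ assigns three
geometric structures at once on the realification $W_{\mathbb{R}}$ of $W$. These are

• a Euclidean structure $g : W_{\mathbb{R}} \times W_{\mathbb{R}} \to \mathbb{R}$ provided by $g(u, w) := \mathrm{Re}(u, w)$;
• a symplectic form⁷ $\omega : W_{\mathbb{R}} \times W_{\mathbb{R}} \to \mathbb{R}$, $\omega(u, w) := \mathrm{Im}(u, w)$;
• a complex structure $I : w \mapsto iw$.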


Indeed, the Hermitian symmetry condition $(u, w) = \overline{(w, u)}$ forces the real and
imaginary parts of the product $(u, w) = g(u, w) + i\omega(u, w)$ to satisfy the relations

$$g(u, w) = g(w, u) \quad \text{and} \quad \omega(u, w) = -\omega(w, u).$$

The positivity of the Hermitian inner product implies the inequality $g(v, v) = (v, v) > 0$
for all $v \neq 0$. In particular, $g$ is nondegenerate. The sesquilinearity
condition $(u, iw) = -i(u, w)$ forces

$$g(u, Iw) = \omega(u, w) \quad \text{and} \quad \omega(u, Iw) = -g(u, w).$$

In terms of matrices, this means that the Gramians $G$, $\Omega$ of the $\mathbb{R}$-bilinear forms $g$, $\omega$
and the matrix $I$ of the complex structure operator are related by the equality $GI = \Omega$,
which allows us to recover any one element of the triple $(I, g, \omega)$ from the other
two. In particular, it shows that $\Omega$ is nondegenerate. Since $(iu, iw) = (u, w)$ by
sesquilinearity, we conclude that

$$g(Iu, Iw) = g(u, w) \quad \text{and} \quad \omega(Iu, Iw) = \omega(u, w).$$

In other words, the complex structure operator $I \in \mathrm{O}_g(W_{\mathbb{R}}) \cap \mathrm{Sp}_\omega(W_{\mathbb{R}})$ is
simultaneously isometric for both forms $g$, $\omega$.
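
The relations between $g$, $\omega$, and $I$ derived above can be spot-checked numerically. The following sketch uses the standard Hermitian product on $\mathbb{C}^2$ and one sample pair of vectors; the helper names (`herm`, `g`, `omega`, `I`, `close`) are hypothetical and exist only for this illustration.

```python
def herm(u, w):
    """Standard Hermitian product on C^n: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def g(u, w):
    return herm(u, w).real      # Euclidean structure Re(u, w)

def omega(u, w):
    return herm(u, w).imag      # symplectic form Im(u, w)

def I(w):
    return [1j * c for c in w]  # complex structure: multiplication by i

def close(a, b):
    return abs(a - b) < 1e-12

u = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]

checks = [
    close(g(u, w), g(w, u)),                # g is symmetric
    close(omega(u, w), -omega(w, u)),       # omega is skew-symmetric
    close(g(u, I(w)), omega(u, w)),         # g(u, Iw) = omega(u, w)
    close(omega(u, I(w)), -g(u, w)),        # omega(u, Iw) = -g(u, w)
    close(g(I(u), I(w)), g(u, w)),          # I is g-orthogonal
    close(omega(I(u), I(w)), omega(u, w)),  # I is symplectic
]
print(checks)
```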


Definition 18.2 (Kähler Triple) Let $V$ be an even-dimensional real vector space
equipped with the data set $(I, g, \omega)$ consisting of the complex structure $I : V \to V$,
Euclidean structure $g : V \times V \to \mathbb{R}$, and symplectic form $\omega : V \times V \to \mathbb{R}$. Write $V_I$
for the complex vector space constructed from $V$ by means of the complex structure
$I$ as in Proposition 18.3 on p. 468. The data set $(I, g, \omega)$ is called a Kähler triple if
the prescription $(u, w) := g(u, w) + i\omega(u, w)$ provides $V_I$ with a Hermitian structure.

⁷ See Sect. 16.6 on p. 411.
18.5.3 Completing Kähler Triples for a Given Euclidean
Structure


Let $V$ be a real Euclidean vector space. Write the Euclidean structure on $V$ as

$$g : V \times V \to \mathbb{R}$$

and let $V_{\mathbb{C}}$ be the complexification of $V$, and $g_{\mathbb{C}} : V_{\mathbb{C}} \times V_{\mathbb{C}} \to \mathbb{C}$ the $\mathbb{C}$-bilinear
complexification⁸ of $g$. The next proposition describes all Kähler triples
that complete $g$ up to a Hermitian structure on $V$.


Proposition 18.4 (Hermitian Completions of a Given Euclidean Structure) For
every (even-dimensional) real Euclidean vector space $(V, g)$, the following data are
in canonical bijection:

(1) the Kähler triple $(I, g, \omega)$,
(2) the Euclidean isometry $I \in \mathrm{O}_g(V)$ such that $I^2 = -\mathrm{Id}_V$,
(3) the complex vector subspace $U \subset V_{\mathbb{C}}$ that is maximal isotropic for $g_{\mathbb{C}}$.

They are related in the same way as in Proposition 18.3 on p. 468. In particular, the
decomposition $V_{\mathbb{C}} = U \oplus \overline{U}$ automatically holds in (3).


Proof As we have seen above, every complex structure $I : V \to V$ extending $g$
to a Kähler triple must be $g$-orthogonal. Conversely, given some complex structure
$I \in \mathrm{O}_g(V)$, the $\mathbb{R}$-bilinear form $g(v, Iw)$ is nondegenerate and skew-symmetric,
because $g(v, Iw) = g(Iv, I^2 w) = -g(Iv, w) = -g(w, Iv)$. Therefore, the $\mathbb{C}$-valued
inner product $(v, w) := g(v, w) + ig(v, Iw)$ on $V$ is conjugate-symmetric and positive.
Since it is $\mathbb{C}$-linear in the first argument,

$$(Iu, w) = g(Iu, w) + ig(u, w) = i\bigl(g(u, w) - ig(Iu, w)\bigr) = i\bigl(g(u, w) + ig(u, Iw)\bigr) = i(u, w),$$

it must be $\mathbb{C}$-antilinear in the second. Thus, (1) and (2) are in bijection with each
other.

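Let us show that the correspondence (2) → (3) from Proposition 18.3 produces
the $g_{\mathbb{C}}$-isotropic $(+i)$-eigenspace $U \subset V_{\mathbb{C}}$ of the complexified operator
$I_{\mathbb{C}} : V_{\mathbb{C}} \to V_{\mathbb{C}}$ if $I \in \mathrm{O}_g(V)$.
In a real basis of $V_{\mathbb{C}}$, the orthogonality condition $I^t G I = G$ on $I$ forces $I_{\mathbb{C}}$ to be
$g_{\mathbb{C}}$-orthogonal. Hence for every $u \in U$ such that $I_{\mathbb{C}} u = iu$, we get the equalities
$g_{\mathbb{C}}(u, u) = g_{\mathbb{C}}(I_{\mathbb{C}} u, I_{\mathbb{C}} u) = g_{\mathbb{C}}(iu, iu) = -g_{\mathbb{C}}(u, u)$. Therefore, $g_{\mathbb{C}}(u, u) = 0$ for all
$(+i)$-eigenvectors $u$ of $I_{\mathbb{C}}$, as required.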


⁸ See Sect. 18.2.6 on p. 465.

Now let us show that every $g_{\mathbb{C}}$-isotropic subspace $U \subset V_{\mathbb{C}}$ has zero intersection
with its conjugate $\overline{U}$. If $u_1 = \overline{u}_2$ for some $u_1, u_2 \in U$, then $u_1 + u_2 \in U$ is real
and $g_{\mathbb{C}}$-isotropic. Since $g_{\mathbb{C}}$ has no real isotropic vectors,⁹ this forces $u_1 = -u_2$.
Therefore, $u_1 = -u_2 = -\overline{u}_1$ is pure imaginary, that is, $u_1 = iv$ for some $v \in V$.
Then $0 = g_{\mathbb{C}}(u_1, u_1) = -g(v, v)$ forces $v = 0$.

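Hence, for every maximal $g_{\mathbb{C}}$-isotropic subspace $U \subset V_{\mathbb{C}}$ of dimension
$\dim_{\mathbb{C}} U = \tfrac{1}{2}\dim_{\mathbb{R}} V$, we have the direct sum decomposition $V_{\mathbb{C}} = U \oplus \overline{U}$. To recover
(2) from (3), it remains to show that every operator $I_{\mathbb{C}}$ acting on $U$ and $\overline{U}$ respectively
as multiplication by $+i$ and $-i$ is $g_{\mathbb{C}}$-orthogonal. For every $u = v_1 + iv_2 \in U$ with
$v_1, v_2 \in V$, we have

$$\begin{aligned}
g_{\mathbb{C}}(\overline{u}, \overline{u}) = g_{\mathbb{C}}(v_1 - iv_2, v_1 - iv_2) &= g(v_1, v_1) - g(v_2, v_2) - 2i\,g(v_1, v_2) \\
&= \overline{g(v_1, v_1) - g(v_2, v_2) + 2i\,g(v_1, v_2)} = \overline{g_{\mathbb{C}}(v_1 + iv_2, v_1 + iv_2)} = 0.
\end{aligned}$$

Hence, $\overline{U}$ also is $g_{\mathbb{C}}$-isotropic. Therefore, for all $u_1, u_2, w_1, w_2 \in U$,

$$\begin{aligned}
g_{\mathbb{C}}(u_1 + \overline{w}_1, u_2 + \overline{w}_2) &= g_{\mathbb{C}}(u_1, \overline{w}_2) + g_{\mathbb{C}}(\overline{w}_1, u_2) = g_{\mathbb{C}}(iu_1, -i\overline{w}_2) + g_{\mathbb{C}}(-i\overline{w}_1, iu_2) \\
&= g_{\mathbb{C}}\bigl(I_{\mathbb{C}}(u_1 + \overline{w}_1), I_{\mathbb{C}}(u_2 + \overline{w}_2)\bigr),
\end{aligned}$$

as required. □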


Example 18.6 (Hermitian Structures on $\mathbb{R}^4$) Let $g$ be the standard Euclidean
structure on $\mathbb{R}^4$. Its Hermitian enhancement turns Euclidean $\mathbb{R}^4$ into Hermitian $\mathbb{C}^2$,
preserving the inner product of every vector with itself. By Proposition 18.4, such
enhancements $(I, g, \omega)$ are in natural bijection with the 2-dimensional isotropic
subspaces of the nondegenerate symmetric $\mathbb{C}$-bilinear form on $\mathbb{C} \otimes \mathbb{R}^4 = \mathbb{C}^4$, that
is, with the lines on the Segre quadric¹⁰ in $\mathbb{P}_3 = \mathbb{P}(\mathbb{C}^4)$. Thus, the Euclidean space
$\mathbb{R}^4$ admits two disjoint pencils of Hermitian enhancements. Their explicit geometric
description will be given in Sect. 20.2.3 on p. 512 below.


18.6 Hermitian Enhancement of Symplectic Structure 


18.6.1 Completing Kähler Triples for a Given Symplectic
Structure


Let $\omega : V \times V \to \mathbb{R}$ be a nondegenerate skew-symmetric $\mathbb{R}$-bilinear form on an
(even-dimensional) vector space $V$ over $\mathbb{R}$. Write $V_{\mathbb{C}}$ for the complexification of $V$
and $\omega_{\mathbb{C}} : V_{\mathbb{C}} \times V_{\mathbb{C}} \to \mathbb{C}$ for the $\mathbb{C}$-bilinear complexification¹¹ of $\omega$. Besides $\omega_{\mathbb{C}}$, we
will also consider the $\mathbb{C}$-sesquilinear extension of $\omega$ to $V_{\mathbb{C}}$ defined by

$$\begin{aligned}
\omega_H(u_1 + iw_1, u_2 + iw_2) &:= \omega_{\mathbb{C}}(u_1 + iw_1, u_2 - iw_2) \\
&= \bigl(\omega(u_1, u_2) + \omega(w_1, w_2)\bigr) - i\bigl(\omega(u_1, w_2) + \omega(u_2, w_1)\bigr)
\end{aligned} \tag{18.17}$$

for all $u_1, u_2, w_1, w_2 \in V$. Note that the $\omega_H$-inner product of every vector with itself
is pure imaginary:

$$\omega_H(u + iw, u + iw) = -2i\,\omega(u, w).$$

The next proposition describes all Kähler triples completing $\omega$ up to a Hermitian
structure on $V$.

⁹ Because $g_{\mathbb{C}}|_V = g$ is positive anisotropic.
¹⁰ See Example 17.6 on p. 439.
¹¹ See Sect. 18.2.6 on p. 465.


Proposition 18.5 (Hermitian Completions of a Given Symplectic Structure)
For every (even-dimensional) real symplectic vector space $(V, \omega)$, the following
data are in canonical bijection:

(1) the Kähler triple $(I, g, \omega)$;
(2) the symplectic isometry $I \in \mathrm{Sp}_\omega(V)$ satisfying the following two conditions:
    (2a) $I^2 = -\mathrm{Id}_V$,
    (2b) the quadratic form $G(v) = -\omega(v, Iv) \in S^2 V^*$ is positive anisotropic;
(3) the complex vector subspace $U \subset V_{\mathbb{C}}$ that is Lagrangian for the symplectic
    $\mathbb{C}$-bilinear form $\omega_{\mathbb{C}}$ and is Hermitian with respect to the $\mathbb{C}$-sesquilinear form
    $i\omega_H$.

They are related in the same way as in Proposition 18.3 on p. 468.


Proof The transfers between (1) and (2) are verified exactly as in Proposition 18.4.
Namely, every Kähler triple $(I, g, \omega)$ has $I \in \mathrm{Sp}_\omega(V)$ and positive definite
$g(v, v) = -\omega(v, Iv)$, as we know. Conversely, for every symplectic complex
structure $I \in \mathrm{Sp}_\omega(V)$, $I^2 = -\mathrm{Id}_V$, the $\mathbb{R}$-bilinear form

$$g : V \times V \to \mathbb{R}, \quad g(u, w) := -\omega(u, Iw),$$

is nondegenerate and symmetric: $\omega(u, Iw) = \omega(Iu, I^2 w) = -\omega(Iu, w) = \omega(w, Iu)$.
Therefore, the $\mathbb{C}$-valued form $V \times V \to \mathbb{C}$ defined by

$$(u, w) := g(u, w) + i\omega(u, w) = -\omega(u, Iw) + i\omega(u, w) \tag{18.18}$$

is conjugate-symmetric. Since it is $\mathbb{C}$-linear in the first argument,

$$\begin{aligned}
(Iu, w) &= -\omega(u, w) + i\omega(Iu, w) = i\bigl(\omega(Iu, w) + i\omega(u, w)\bigr) \\
&= i\bigl(-\omega(u, Iw) + i\omega(u, w)\bigr) = i(u, w),
\end{aligned}$$

it is forced to be $\mathbb{C}$-antilinear in the second and therefore $\mathbb{C}$-sesquilinear altogether.
It is positive if and only if $(v, v) = -\omega(v, Iv) > 0$ for all nonzero $v \in V$. Therefore,
(1) and (2) are in bijection with each other.

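The passage (2) → (3) described in Proposition 18.3 leads to the Lagrangian
$(+i)$-eigenspace $U \subset V_{\mathbb{C}}$ of $I_{\mathbb{C}}$, because the symplectic operator $I \in \mathrm{Sp}_\omega(V)$
has symplectic complexification $I_{\mathbb{C}} \in \mathrm{Sp}_{\omega_{\mathbb{C}}}(V_{\mathbb{C}})$, and hence for all $(+i)$-eigenvectors
$w_1, w_2 \in U$, we have the equalities

$$\omega_{\mathbb{C}}(w_1, w_2) = \omega_{\mathbb{C}}(I_{\mathbb{C}} w_1, I_{\mathbb{C}} w_2) = \omega_{\mathbb{C}}(iw_1, iw_2) = -\omega_{\mathbb{C}}(w_1, w_2),$$

forcing $\omega_{\mathbb{C}}(w_1, w_2) = 0$. Since for all $u + iv \in U$ with $u, v \in V$ we have $Iu = -v$,
the restriction of the sesquilinear form $i\omega_H$ to $U$ looks like this:

$$\begin{aligned}
i\omega_H(u_1 + iv_1, u_2 + iv_2) &= \omega(u_1, v_2) - \omega(v_1, u_2) + i\bigl(\omega(u_1, u_2) + \omega(v_1, v_2)\bigr) \\
&= -\omega(u_1, Iu_2) + \omega(Iu_1, u_2) + i\bigl(\omega(u_1, u_2) + \omega(Iu_1, Iu_2)\bigr) \\
&= -2\omega(u_1, Iu_2) + 2i\,\omega(u_1, u_2).
\end{aligned}$$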


In other words, for every $w_1, w_2 \in U$, we have the following remarkable coincidence:

$$i\omega_H(w_1, w_2) = 2g(\mathrm{Re}\,w_1, \mathrm{Re}\,w_2) + 2i\,\omega(\mathrm{Re}\,w_1, \mathrm{Re}\,w_2) = 2\,(\mathrm{Re}\,w_1, \mathrm{Re}\,w_2),$$

where the rightmost term is the inner product (18.18). Hence this product is positive
on $V$ if and only if the sesquilinear form $i\omega_H$ is positive on $U$. In particular, every
complex structure satisfying (2) has $(+i)$-eigenspace $U$ satisfying (3).

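Conversely, for every Lagrangian subspace $U \subset V_{\mathbb{C}}$ such that the real quadratic
form $i\omega_H(w, w) = i\omega_{\mathbb{C}}(w, \overline{w})$ is positive on $U$, the intersection $U \cap \overline{U}$ is equal to
zero, because for every $u_1, u_2 \in U$ such that $u_1 = \overline{u}_2$, we have $0 = i\omega_{\mathbb{C}}(u_2, u_1) =
i\omega_{\mathbb{C}}(u_2, \overline{u}_2) = i\omega_H(u_2, u_2)$, which forces $u_2 = 0$. Exactly as in Proposition 18.4, the
conjugate space $\overline{U}$ is also Lagrangian for $\omega_{\mathbb{C}}$, because for every $w_{1,2} = u_{1,2} + iv_{1,2} \in U$
with $u_{1,2}, v_{1,2} \in V$, we have

$$\omega_{\mathbb{C}}(\overline{w}_1, \overline{w}_2) = \omega_{\mathbb{C}}(u_1 - iv_1, u_2 - iv_2) = \overline{\omega_{\mathbb{C}}(u_1 + iv_1, u_2 + iv_2)} = \overline{\omega_{\mathbb{C}}(w_1, w_2)} = 0.$$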


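It remains to check that the symplectic form $\omega_{\mathbb{C}}$ is preserved by the operator $I_{\mathbb{C}}$
acting on $V_{\mathbb{C}} = U \oplus \overline{U}$ as multiplication by $+i$ on the first summand and by $-i$ on
the second. This is straightforward: for all $u_1, u_2, v_1, v_2 \in U$,

$$\begin{aligned}
\omega_{\mathbb{C}}(u_1 + \overline{v}_1, u_2 + \overline{v}_2) &= \omega_{\mathbb{C}}(u_1, \overline{v}_2) + \omega_{\mathbb{C}}(\overline{v}_1, u_2) \\
&= \omega_{\mathbb{C}}(iu_1, -i\overline{v}_2) + \omega_{\mathbb{C}}(-i\overline{v}_1, iu_2) = \omega_{\mathbb{C}}\bigl(I_{\mathbb{C}}(u_1 + \overline{v}_1), I_{\mathbb{C}}(u_2 + \overline{v}_2)\bigr).
\end{aligned}$$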


18.6.2 Siegel Upper Half-Space and Riemann Relations 


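Consider the coordinate space $V = \mathbb{R}^{2n}$ equipped with the standard symplectic
form¹² $\omega$, whose Gramian in the standard basis of $\mathbb{R}^{2n}$ is

$$\begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}.$$

Let us write $e'_1, e'_2, \ldots, e'_n, e''_1, e''_2, \ldots, e''_n$ for this basis. Then both coordinate
subspaces $V'$ and $V''$ spanned by the $e'_i$ and $e''_i$ are Lagrangian, and $V = V' \oplus V''$.
The complexified space $V_{\mathbb{C}} = \mathbb{C}^{2n}$ has complexified splitting $V_{\mathbb{C}} = V'_{\mathbb{C}} \oplus V''_{\mathbb{C}}$, where
$V'_{\mathbb{C}}$, $V''_{\mathbb{C}}$ are Lagrangian for the complexified form $\omega_{\mathbb{C}}$.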

By Proposition 18.5, the Kähler triples $(I, g, \omega)$ completing $\omega$ to a Hermitian
structure on $V$ are in bijection with the decompositions $V_{\mathbb{C}} = U \oplus \overline{U}$ such that
both $U$ and $\overline{U}$ are Lagrangian for $\omega_{\mathbb{C}}$, and $i\omega_H$ is restricted to a positive form on
$U$. For every such decomposition, the first summand $U$ has zero intersection with
$V''_{\mathbb{C}}$, because $V''$ is Lagrangian for $\omega = \omega_{\mathbb{C}}|_V$, and for all $v''_1, v''_2 \in V''$, we have
$i\omega_H(v''_1 + iv''_2, v''_1 + iv''_2) = 0$. Therefore, the projection of $U$ onto $V'_{\mathbb{C}}$ along $V''_{\mathbb{C}}$
gives an isomorphism of complex vector spaces $U \xrightarrow{\sim} V'_{\mathbb{C}}$. In particular, there exists
a unique basis $w = (w_1, w_2, \ldots, w_n)$ in $U$ projected to $e' = (e'_1, e'_2, \ldots, e'_n)$ along
$V''_{\mathbb{C}}$. In matrix notation, this basis is expressed in terms of the standard basis of $V_{\mathbb{C}}$ as

$$(w_1, w_2, \ldots, w_n) = (e'_1, \ldots, e'_n, e''_1, \ldots, e''_n) \cdot \begin{pmatrix} E \\ S \end{pmatrix}, \tag{18.19}$$


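where $S \in \mathrm{Mat}_n(\mathbb{C})$. Note that the subspace $U$ is uniquely determined by the matrix
$S$. The Gramians of the restricted forms $\omega_{\mathbb{C}}|_U$ and $i\omega_H|_U$ in the basis $w = e' + e''S$
are equal to

$$\begin{pmatrix} E & S^t \end{pmatrix} \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix} \begin{pmatrix} E \\ S \end{pmatrix} = S - S^t
\quad \text{and} \quad
i \begin{pmatrix} E & S^t \end{pmatrix} \begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix} \begin{pmatrix} E \\ \overline{S} \end{pmatrix} = i\bigl(\overline{S} - S^t\bigr)$$

respectively. Thus, a complex subspace $U \subset V_{\mathbb{C}}$ is Lagrangian if and only if the
matrix $S$ is symmetric. In this case, $i(\overline{S} - S^t) = i(\overline{S} - S) = 2\,\mathrm{Im}\,S$, and therefore the
positivity of $i\omega_H$ on $U$ is equivalent to the positivity of the real symmetric matrix $\mathrm{Im}\,S$.
We come to the following theorem.

¹² See Example 16.3 on p. 393.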


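Theorem 18.1 The Kähler triples $(I, g, \omega)$ completing the standard symplectic
structure $\omega$ in $\mathbb{R}^{2n}$ to a Hermitian structure are in bijection with the symmetric
complex matrices $S \in \mathrm{Mat}_n(\mathbb{C})$ having positive imaginary part, i.e., satisfying the
conditions

$$S \in \mathrm{Mat}_n(\mathbb{C}), \quad S^t = S, \quad \forall x \in \mathbb{R}^n \setminus 0 \quad x \cdot (\mathrm{Im}\,S) \cdot x^t > 0. \tag{18.20}$$

The complex structure $I_S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ corresponding to such a matrix $S = X + iY$,
$X, Y \in \mathrm{Mat}_n(\mathbb{R})$, has block matrix

$$I_S = \begin{pmatrix} -Y^{-1}X & Y^{-1} \\ -Y - XY^{-1}X & XY^{-1} \end{pmatrix} \tag{18.21}$$

in the standard symplectic basis $e'$, $e''$ of $\mathbb{R}^{2n}$.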


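Proof Only formula (18.21) remains to be verified. By Proposition 18.3, the
complex structure $I : V \to V$ coming from the decomposition $V_{\mathbb{C}} = U \oplus \overline{U}$ sends
the vector $v = \mathrm{Re}\,w \in V$ to $I(v) = \mathrm{Re}(iw)$ for every $w \in U$. If $w = e' + e'' \cdot (X + iY)$,
then $\mathrm{Re}(w) = e' + e'' \cdot X$ and $\mathrm{Re}(iw) = -e'' \cdot Y$. Therefore,

$$\begin{aligned}
I(e'') &= I\bigl(\mathrm{Re}(-iw \cdot Y^{-1})\bigr) = \mathrm{Re}(w) \cdot Y^{-1} = e' \cdot Y^{-1} + e'' \cdot XY^{-1}, \\
I(e') &= I\bigl(\mathrm{Re}(w) - e'' \cdot X\bigr) = \mathrm{Re}(iw) - I(e'') \cdot X \\
&= -e' \cdot Y^{-1}X - e'' \cdot (Y + XY^{-1}X).
\end{aligned}$$

□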
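
For $n = 1$ the statement can be checked by exact rational arithmetic. The sketch below assumes the standard symplectic Gramian $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and the sample point $S = 3/2 + 2i$ of the upper half-plane (both are choices made here for illustration); it verifies that $I_S^2 = -E$, that $I_S$ preserves $\omega$, and that $G(v) = -\omega(v, I_S v)$ is positive on a few vectors. All function names are hypothetical.

```python
from fractions import Fraction

def I_S(x, y):
    """Matrix of the complex structure for S = x + i*y in the case n = 1 (y > 0)."""
    return [[-x / y, 1 / y],
            [-(y + x * x / y), x / y]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

J = [[0, 1], [-1, 0]]   # assumed Gramian of the standard symplectic form on R^2
x, y = Fraction(3, 2), Fraction(2)
M = I_S(x, y)

square = matmul(M, M)                            # should be minus the identity
preserved = matmul(transpose(M), matmul(J, M))   # should be J again

def G(v):
    """Quadratic form G(v) = -omega(v, I v), where omega(a, b) = a^t J b."""
    Iv = [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]
    return -(v[0] * Iv[1] - v[1] * Iv[0])

print(square, preserved, G([1, 0]), G([0, 1]), G([1, 1]))
```

Replacing `y` by a negative number keeps the first two identities but makes `G` negative, which is why positivity of $\mathrm{Im}\,S$ is needed.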


Remark 18.1 (Notation and Terminology) The locus of all complex matrices satisfying
(18.20) is called the Siegel upper half-space and is denoted by $\mathfrak{H}_n$. The name
goes back to the case $n = 1$, for which $\mathrm{Mat}_1(\mathbb{C}) = \mathbb{C}$ and $\mathfrak{H}_1 \subset \mathbb{C}$ is exactly
the upper half-plane $\mathrm{Im}\,z > 0$. The constraints (18.20) are known as the Riemann
relations. They are to be found in several branches of mathematics that appear to
be quite far from each other. For example, given a $\mathbb{Z}$-module $\Lambda \simeq \mathbb{Z}^{2n}$ spanned by
the standard $n$ basis vectors in $\mathbb{C}^n$ and the $n$ columns of a matrix $S \in \mathrm{Mat}_n(\mathbb{C})$,
the torus $T_\Lambda = \mathbb{C}^n / \Lambda$ admits an analytic embedding $T_\Lambda \hookrightarrow \mathbb{P}_N$ that identifies $T_\Lambda$
with an algebraic projective variety if and only if the matrix $S$ satisfies the Riemann
relations.¹³


¹³ For details, see Tata Lectures on Theta I, by D. Mumford [Mu] and Algebraic Curves, Algebraic
Manifolds and Schemes, by V. I. Danilov, V. V. Shokurov [DS].
Problems for Independent Solution to Chap. 18


Problem 18.1 (Conjugate Complex Structure) For a complex vector space $W$,
write $\overline{W}$ for the same abelian group of vectors as $W$ but equipped with another
multiplication by complex numbers defined by $z \ast w := \overline{z} \cdot w$. Show that: (a) $\overline{W}$
is a vector space over $\mathbb{C}$, (b) $\dim_{\mathbb{C}} \overline{W} = \dim_{\mathbb{C}} W$, (c) there is a natural $\mathbb{C}$-linear
isomorphism $W \oplus \overline{W} \simeq (W_{\mathbb{R}})_{\mathbb{C}}$, where the right-hand space is the complexified
realification of $W$.


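Problem 18.2 Let $W$ be a complex vector space and $F : W \to W$ a $\mathbb{C}$-linear
operator. Write $F_{\mathbb{C}} : (W_{\mathbb{R}})_{\mathbb{C}} \to (W_{\mathbb{R}})_{\mathbb{C}}$ for the complexification of the $\mathbb{R}$-linear
operator $F : W_{\mathbb{R}} \to W_{\mathbb{R}}$ provided by $F$ on the realified vector space $W_{\mathbb{R}}$. How
are the characteristic polynomials, eigenvalues, and eigenvectors of $F$ and $F_{\mathbb{C}}$
related?¹⁴


Problem 18.3 Show that for every vector space $V$ over the field $\mathbb{R}$, the map

$$\mathbb{C} \otimes \mathrm{End}_{\mathbb{R}}(V) \to \mathrm{End}_{\mathbb{C}}(\mathbb{C} \otimes V), \quad F + iG \mapsto F_{\mathbb{C}} + iG_{\mathbb{C}},$$

is a well-defined $\mathbb{C}$-linear isomorphism of complex vector spaces.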


Problem 18.4 (Hermitian Adjoint Operators) Let a complex vector space $W$ be
equipped with the Hermitian inner product $(u, w)$. Show that for every operator
$F \in \mathrm{End}_{\mathbb{C}}(W)$, there exists a unique operator $F^\dagger \in \mathrm{End}_{\mathbb{C}}(W)$ such that $(u, Fw) =
(F^\dagger u, w)$ and $(u, F^\dagger w) = (Fu, w)$ for all $u, w \in W$. Check that $(FG)^\dagger = G^\dagger F^\dagger$
and the map $\sigma : F \mapsto F^\dagger$ provides the complex vector space $\mathrm{End}_{\mathbb{C}}(W)$ with a real
structure.

Problem 18.5 Make a direct computation checking that the matrix $I_S$ from formula
(18.21) on p. 477 has $I_S^2 = -E$ and preserves the skew-symmetric form $\omega$.


Problem 18.6 Construct an isomorphism of groups $\mathrm{U}_n \simeq \mathrm{O}_{2n}(\mathbb{R}) \cap \mathrm{Sp}_{2n}(\mathbb{R})$.


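Problem 18.7 For the realification $W_{\mathbb{R}}$ of the complex vector space $W$, write
$I : W_{\mathbb{R}} \to W_{\mathbb{R}}$ for the multiplication-by-$i$ operator $w \mapsto iw$ and put

$$\begin{aligned}
\mathrm{End}_{\mathbb{C}}(W_{\mathbb{R}}) &:= \{F \in \mathrm{End}_{\mathbb{R}}(W_{\mathbb{R}}) \mid FI = IF\}, \\
\mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}}) &:= \{F \in \mathrm{End}_{\mathbb{R}}(W_{\mathbb{R}}) \mid FI = -IF\}.
\end{aligned}$$

Show that $\mathrm{End}_{\mathbb{R}}(W_{\mathbb{R}}) = \mathrm{End}_{\mathbb{C}}(W_{\mathbb{R}}) \oplus \mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}})$ and that every $F \in \mathrm{End}_{\mathbb{R}}(W_{\mathbb{R}})$
can be decomposed as $F = C_F + A_F$ with $C_F = (F - IFI)/2 \in \mathrm{End}_{\mathbb{C}}(W_{\mathbb{R}})$,
$A_F = (F + IFI)/2 \in \mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}})$. Also show that the map $F \mapsto IF$ provides both
spaces $\mathrm{End}_{\mathbb{C}}(W_{\mathbb{R}})$ and $\mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}})$ with complex structures in which they become
a pair of conjugate complex vector spaces in the sense of Problem 18.1, that is,
$\mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}}) = \overline{\mathrm{End}_{\mathbb{C}}(W_{\mathbb{R}})}$.


¹⁴ Note that $F_{\mathbb{C}}$ acts on a space of twice the dimension of that on which $F$ acts.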

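Problem 18.8 Under the conditions and notation from Problem 18.7, assume that
$W$ is equipped with a Hermitian inner product $(u, w)$ and put $\omega(u, w) = \mathrm{Im}(u, w)$.
Check that for every $F \in \mathrm{Sp}_\omega(W_{\mathbb{R}})$, its $\mathbb{C}$-linear component $C_F \in \mathrm{GL}_{\mathbb{C}}(W_{\mathbb{R}}) \subset
\mathrm{End}_{\mathbb{C}}(W_{\mathbb{R}})$ is invertible and therefore $F$ can be written as $F = C_F(\mathrm{Id}_W + Z_F)$,
where $Z_F = C_F^{-1} A_F$. Then prove that
(a) $Z_F \in \mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}})$,
(b) $Z_{F^{-1}} = -C_F Z_F C_F^{-1}$,
(c) $C_{F^{-1}} = C_F^\dagger$,
(d) $F^{-1} = (1 - Z_F)\,C_F^\dagger$,
(e) $C_F(1 - Z_F^2)\,C_F^\dagger = \mathrm{Id}_W$,
(f) $(u, Z_F w) = (w, Z_F u)$ for all $u, w \in W$,
(g) $(w, (1 - Z_F^2)w) > 0$ for all nonzero $w \in W$,
(h) $C_{F_1 F_2} = C_{F_1}\bigl(\mathrm{Id}_W - Z_{F_1} Z_{F_2^{-1}}\bigr) C_{F_2}$ and
$Z_{F_1 F_2} = C_{F_2}^{-1}\bigl(\mathrm{Id}_W - Z_{F_1} Z_{F_2^{-1}}\bigr)^{-1}\bigl(Z_{F_1} - Z_{F_2^{-1}}\bigr) C_{F_2}$ for any $F_1, F_2 \in \mathrm{Sp}_\omega(W_{\mathbb{R}})$.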


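Problem 18.9 Under the conditions and notation from Problem 18.8, write $B(W)$
for the set of all pairs $(C, Z) \in \mathrm{GL}_{\mathbb{C}}(W) \times \mathrm{End}_{\overline{\mathbb{C}}}(W_{\mathbb{R}})$ such that $C(1 - Z^2)C^\dagger =
\mathrm{Id}_W$, $(u, Zw) = (w, Zu)$ for all $u, w \in W$, and $(w, (1 - Z^2)w) > 0$ for all nonzero
$w \in W$. Show that the mappings $F \mapsto (C_F, Z_F)$ and $(C, Z) \mapsto C(1 + Z)$ assign
two mutually inverse bijections $\mathrm{Sp}_\omega(W_{\mathbb{R}}) \rightleftarrows B(W)$.


Problem 18.10 Show that the Siegel upper half-space $\mathfrak{H}_n \subset \mathrm{Mat}_n(\mathbb{C})$ is
contractible.¹⁵


¹⁵ That is, there exists a continuous map $\gamma : \mathfrak{H}_n \times [0, 1] \to \mathfrak{H}_n$ whose restrictions to $\mathfrak{H}_n \times \{0\}$ and
to $\mathfrak{H}_n \times \{1\}$ are, respectively, the identity map $\mathrm{Id}_{\mathfrak{H}_n}$ and a constant map that sends the whole $\mathfrak{H}_n$ to
some point.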


Chapter 19 
Hermitian Spaces 


19.1 Hermitian Geometry 


Recall¹ that a vector space $W$ over the field $\mathbb{C}$ is called Hermitian if for any two
vectors $u, w \in W$, the $\mathbb{R}$-bilinear Hermitian inner product $(u, w) \in \mathbb{C}$ is defined
such that

$$(u, w) = \overline{(w, u)}, \quad (zu, w) = z(u, w) = (u, \overline{z} w), \quad \text{and} \quad (w, w) > 0 \text{ for all } w \neq 0.$$

It provides every vector $w \in W$ with a Hermitian norm² $\|w\| := \sqrt{(w, w)} \in \mathbb{R}_{\geq 0}$.
Since

$$\begin{aligned}
(u + w, u + w) &= \|u\|^2 + \|w\|^2 + 2\,\mathrm{Re}(u, w), \\
(u + iw, u + iw) &= \|u\|^2 + \|w\|^2 + 2\,\mathrm{Im}(u, w),
\end{aligned}$$

the Hermitian inner product is uniquely recovered from the norm function and the
multiplication-by-$i$ operator as

$$2(w_1, w_2) = \|w_1 + w_2\|^2 + i\,\|w_1 + iw_2\|^2 - (1 + i)\bigl(\|w_1\|^2 + \|w_2\|^2\bigr). \tag{19.1}$$

Note that this agrees with the general ideology of Kähler triples from Sect. 18.5.2
on p. 471.
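
That the inner product is determined by the norm together with multiplication by $i$ can be illustrated numerically. The sketch below, for the standard Hermitian product on $\mathbb{C}^2$ (linear in the first argument), recovers $\mathrm{Re}(u, w)$ from $\|u + w\|^2$ and $\mathrm{Im}(u, w)$ from $\|u + iw\|^2$; the names `herm`, `norm2`, and `polarize` are hypothetical.

```python
def herm(u, w):
    """Standard Hermitian product on C^n: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def norm2(w):
    return herm(w, w).real

def polarize(u, w):
    """Recover (u, w) using only squared norms and multiplication by i."""
    s = norm2(u) + norm2(w)
    re = (norm2([a + b for a, b in zip(u, w)]) - s) / 2
    im = (norm2([a + 1j * b for a, b in zip(u, w)]) - s) / 2
    return complex(re, im)

u = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]
print(polarize(u, w), herm(u, w))
```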


¹ See Definition 18.1 on p. 469.

² Or just length of the vector $w$; we use the double vertical bar notation to prevent confusion with
the absolute value $|z|$ of a complex number.
19.1.1 Gramians


All the machinery of Gram matrices developed in Sect. 16.1.2 on p. 388 can
be perfectly applied to Hermitian inner products considered as $\mathbb{R}$-bilinear forms.
Since the Hermitian product is conjugate-symmetric, for every collection $w =
(w_1, w_2, \ldots, w_m)$ of vectors $w_i \in W$, the Gramian of this collection $G_w = \bigl((w_i, w_j)\bigr)$
is a Hermitian matrix, that is, it satisfies

$$G_w^t = \overline{G}_w.$$

If one collection of vectors is linearly expressed through another collection as $w =
v\,C_{vw}$ by means of a complex transition matrix $C_{vw}$, then the Gramians of those
collections are related by

$$G_w = C_{vw}^t\, G_v\, \overline{C}_{vw}, \tag{19.2}$$

because the Hermitian product is $\mathbb{C}$-antilinear in the second argument.


Exercise 19.1 Verify formula (19.2). 


19.1.2. Gram—Schmidt Orthogonalization Procedure 


Let $W$ be a Hermitian space and $u = (u_1, u_2, \ldots, u_m)$ any collection of vectors
$u_j \in W$. Exactly as in Euclidean space,⁴ there exists an orthonormal⁵ basis
$e = (e_1, e_2, \ldots, e_k)$ in the linear span of the vectors $u$ such that the transition
matrix⁶ $C_{ue} = (c_{ij})$ is upper triangular, i.e., has $c_{ij} = 0$ for all $i > j$. An
orthonormal basis $e$ is constructed by the same Gram–Schmidt recursive procedure
as in Proposition 10.1 on p. 231.


Exercise 19.2 Verify that it works perfectly. 
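
A minimal sketch of the Gram–Schmidt recursion in the Hermitian setting, assuming the standard inner product on $\mathbb{C}^n$ (linear in the first argument). The function names and the tolerance `eps` used to discard linearly dependent vectors are choices made for this illustration only.

```python
import math

def herm(u, w):
    """Standard Hermitian product on C^n: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def gram_schmidt(vectors, eps=1e-12):
    """Return an orthonormal basis of the span, built vector by vector."""
    basis = []
    for v in vectors:
        r = list(v)
        for e in basis:
            c = herm(r, e)                      # coefficient of r along e
            r = [a - c * b for a, b in zip(r, e)]
        n = math.sqrt(herm(r, r).real)
        if n > eps:                             # skip vectors already in the span
            basis.append([a / n for a in r])
    return basis

# The third vector is the sum of the first two, so the span is 2-dimensional.
vectors = [[1, 1j, 0], [1, 0, 1], [2, 1j, 1]]
basis = gram_schmidt(vectors)
gram = [[herm(e, f) for f in basis] for e in basis]
print(len(basis), gram)
```

Since each new basis vector is a combination of the current input vector and the earlier ones, the transition matrix is upper triangular, as stated above.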


Lemma 19.1 For every collection of vectors $w = (w_1, w_2, \ldots, w_m)$, the Gram
determinant $\Gamma_w := \det G_w$ is a real nonnegative number vanishing if and only if
the vectors are linearly related.


Proof Let $w = e\,C_{ew}$, where $e = (e_1, e_2, \ldots, e_n)$ is an orthonormal basis in the
linear span of $w$. Then $G_w = C_{ew}^t\, \overline{C}_{ew}$. If $n < m$, then $\mathrm{rk}\,G_w \leq \mathrm{rk}\,C_{ew} \leq n < m$. This
forces $\det G_w = 0$. If $n = m$, then $\det G_w = \det C_{ew} \cdot \overline{\det C_{ew}} = |\det C_{ew}|^2$ is real
and positive. □


³ See Example 18.2 on p. 466.
⁴ See Proposition 10.1 on p. 231.
⁵ That is, with Gramian $G_e = E$.
⁶ Recall that the $j$th column of $C_{ue}$ is formed by the coefficients of the linear expansion of the vector
$e_j$ in terms of the vectors $u$.


19.1.3 Cauchy–Schwarz Inequality


For a collection of two vectors $u$, $w$, the previous lemma implies the inequality

$$\det \begin{pmatrix} (u, u) & (u, w) \\ (w, u) & (w, w) \end{pmatrix} = \|u\|^2 \|w\|^2 - (u, w) \cdot \overline{(u, w)} \geq 0,$$

which is an equality if and only if the vectors are proportional. Usually it is written
as

$$|(u, w)| \leq \|u\| \cdot \|w\| \tag{19.3}$$

and is called the Cauchy–Schwarz inequality or Cauchy–Bunyakovsky–Schwarz
inequality.


Corollary 19.1 (Triangle Inequality) $\|u\| + \|w\| \geq \|u + w\|$ for all $u, w \in W$.


Proof $\|u + w\|^2 = \|u\|^2 + \|w\|^2 + 2\,\mathrm{Re}(u, w) \leq \|u\|^2 + \|w\|^2 + 2|(u, w)| \leq \|u\|^2 + \|w\|^2 + 2\|u\| \cdot \|w\| =
(\|u\| + \|w\|)^2$. □


19.1.4 Unitary Group 


A $\mathbb{C}$-linear operator $F : W \to W$ on a Hermitian space $W$ is called unitary⁷ if
$\|Fw\| = \|w\|$ for all $w \in W$. Formula (19.1) implies that every unitary operator $F$
preserves the inner product:

$$(Fu, Fw) = (u, w) \quad \forall\, u, w \in W.$$

This forces the matrix of $F$ in any basis to be related to the Gramian of that basis by

$$F^t \cdot G \cdot \overline{F} = G. \tag{19.4}$$

Computation of determinants leads to $|\det F| = 1$. In particular, $F$ is invertible and

$$F^{-1} = \overline{G}^{-1}\, \overline{F}^t\, \overline{G} = (G^t)^{-1}\, \overline{F}^t\, G^t.$$

In every orthonormal basis, this formula becomes $F^{-1} = \overline{F}^t$.


⁷ Or a Hermitian isometry.

The unitary operators on $W$ form a group called the unitary group of the
Hermitian space $W$ and denoted by $\mathrm{U}(W)$. Passage to the matrices of operators in
some orthonormal basis $e_1, e_2, \ldots, e_n$ assigns an isomorphism between $\mathrm{U}(W)$ and
the group of unitary matrices

$$\mathrm{U}_n := \{F \in \mathrm{GL}_n(\mathbb{C}) \mid F^{-1} = \overline{F}^t\}.$$

As usual, its subgroup $\mathrm{SU}_n := \mathrm{SL}_n(\mathbb{C}) \cap \mathrm{U}_n = \{F \in \mathrm{U}_n \mid \det F = 1\}$ is called
the special unitary group. In contrast with the Euclidean case, the determinant of a
nonspecial unitary matrix can take any value on the unit circle

$$\mathrm{U}_1 = \{z \in \mathbb{C} \mid |z| = 1\}$$

and does not break isometries into disjoint classes. In other words, there is no
orientation in Hermitian geometry.
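
To illustrate, the following sketch builds a $2 \times 2$ unitary matrix whose determinant is $e^{i\alpha}$ (the parameters `t` and `alpha` are arbitrary sample values chosen here), and checks that it preserves the standard Hermitian product while its determinant lies on the unit circle without being equal to 1. The helper names are hypothetical.

```python
import cmath
import math

def herm(u, w):
    """Standard Hermitian product on C^n: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def apply(F, v):
    return [sum(F[i][j] * v[j] for j in range(len(v))) for i in range(len(F))]

t, alpha = 0.7, 1.1
phase = cmath.exp(1j * alpha)
# The columns (phase*cos t, phase*sin t) and (-sin t, cos t) are orthonormal.
F = [[phase * math.cos(t), -math.sin(t)],
     [phase * math.sin(t), math.cos(t)]]

u = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]

preserved = abs(herm(apply(F, u), apply(F, w)) - herm(u, w)) < 1e-12
det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
print(preserved, abs(det), det)
```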


19.1.5 Hermitian Volume 


Choose some orthonormal basis $e = (e_1, e_2, \ldots, e_n)$ in $W$ as a unit-volume basis and
define the Hermitian volume of the parallelepiped spanned by the vectors $v = e\,C_{ev}$
as

$$\mathrm{Vol}(v_1, v_2, \ldots, v_n) := |\det C_{ev}|.$$

Since the absolute value of the determinant for the transition matrix between
orthonormal bases equals 1, the Hermitian volume does not depend on the choice of
orthonormal basis $e = (e_1, e_2, \ldots, e_n)$ in $W$. Almost the same computation as in the
Euclidean case,

$$\mathrm{Vol}^2(v_1, v_2, \ldots, v_n) = |\det C_{ev}|^2 = \det C_{ev}^t \cdot \det \overline{C}_{ev} = \det G_v,$$


shows that the squared Hermitian volume equals the Gram determinant. 


19.1.6 Hermitian Correlation 


The Hermitian correlation on a complex Hermitian vector space $W$ takes a vector
$w \in W$ to the $\mathbb{C}$-linear form $h_w : W \to \mathbb{C}$, $v \mapsto (v, w)$, which depends
$\mathbb{C}$-antilinearly on $w \in W$. This assigns the $\mathbb{R}$-linear and $\mathbb{C}$-antilinear map

$$h : W \to W^*, \quad w \mapsto h_w = (\ast, w). \tag{19.5}$$

Since $h_w(w) = (w, w) > 0$ for $w \neq 0$, this map is injective. Since $\dim W = \dim W^*$, it is a
$\mathbb{C}$-antilinear isomorphism. Moreover, the Hermitian correlation is symmetric, meaning
that its dual map⁸ $h^* : W^{**} \to W^*$ becomes $h$ under the canonical identification
$W^{**} \simeq W$.


Exercise 19.3 Verify that the matrix of the Hermitian correlation in any dual bases
$e$, $e^*$ of $W$, $W^*$ coincides with the Gramian $G_e$ of the basis $e$ in the sense that $h$ takes
the vector $w = e \cdot x$, where $x \in \mathbb{C}^n$ is a column of coordinates, to the covector $e^* \cdot y$,
where $y = G_e \cdot \overline{x}$.


For a basis $w = (w_1, w_2, \ldots, w_n)$ of $W$, the preimage $w^\vee = h^{-1} w^*$ of the
basis $w^*$ in $W^*$ dual to $w$ is called the Hermitian dual basis to $w$. The basis
$w^\vee = (w_1^\vee, w_2^\vee, \ldots, w_n^\vee)$ is uniquely determined by the orthogonality relations

$$(w_i, w_j^\vee) = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j, \end{cases}$$

and is expressed in terms of $w$ as $w^\vee = w\,\overline{G}_w^{-1}$.


19.1.7 Orthogonal Projections 


Given a subspace $U \subset W$ in a Hermitian space $W$, the subspace

$$U^\perp = \{w \in W \mid \forall\, u \in U \ \ (u, w) = 0\}$$

is called the orthogonal complement⁹ to $U$. The positivity of the inner product
implies the transversality $U \cap U^\perp = 0$. Since $\dim U^\perp = \dim h(U^\perp) =
\dim \mathrm{Ann}\,U = \dim W - \dim U$, we conclude that $W = U \oplus U^\perp$. The projection
$\pi_U : W \to U$ along $U^\perp$ is called the orthogonal projection onto $U$. Its action on an
arbitrary vector $w \in W$ is described in the same way as in Euclidean space.


Proposition 19.1 For every $w \in W$, there exists a unique $w_U \in U$ with the
following equivalent properties:

(1) $w - w_U \in U^\perp$,
(2) $(w, u) = (w_U, u)$ for all $u \in U$,
(3) $\|w - w_U\| < \|w - u\|$ for all $u \neq w_U$ in $U$.

For every pair of Hermitian dual bases $u_1, u_2, \ldots, u_k$ and $u_1^\vee, u_2^\vee, \ldots, u_k^\vee$ in $U$,

$$\pi_U w = w_U = \sum_{i=1}^{k} (w, u_i^\vee) \cdot u_i. \tag{19.6}$$

Proof Completely the same as the proof of Theorem 10.1 on p. 237. □


⁸ It takes a $\mathbb{C}$-linear form $F : W^* \to \mathbb{C}$ to the composition $F \circ h : W \to \mathbb{C}$ and also is $\mathbb{C}$-antilinear.

⁹ Or just the orthogonal.


Exercise 19.4 Check this yourself. 
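
The projection formula can be tried out in the simplest situation, where the basis of $U$ is orthonormal and therefore coincides with its own Hermitian dual basis. The sketch below assumes the standard inner product on $\mathbb{C}^3$ and a sample 2-dimensional subspace; all names and data are illustrative.

```python
import math

def herm(u, w):
    """Standard Hermitian product on C^n: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def project(w, basis):
    """Orthogonal projection of w onto span(basis); `basis` is assumed orthonormal."""
    out = [0j] * len(w)
    for e in basis:
        c = herm(w, e)                           # coefficient (w, e) of e
        out = [o + c * x for o, x in zip(out, e)]
    return out

def dist(a, b):
    d = [x - y for x, y in zip(a, b)]
    return math.sqrt(herm(d, d).real)

s = 1 / math.sqrt(2)
basis = [[s, 1j * s, 0], [0, 0, 1]]              # orthonormal basis of U
w = [1 + 1j, 2, 3j]

w_U = project(w, basis)
residual = [a - b for a, b in zip(w, w_U)]
orthogonal = all(abs(herm(residual, e)) < 1e-12 for e in basis)
closer = dist(w, w_U) < dist(w, basis[0])        # compare with another vector of U
print(w_U, orthogonal, closer)
```

For a basis that is not orthonormal, the coefficients would have to be computed against the Hermitian dual basis instead, as in (19.6).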


19.1.8 Angle Between Two Lines 


Recall that in the Euclidean plane, the angle g = & (u,w) between two nonzero 
vectors u, w is defined by 


cos ~g = ————_.. (19.7) 


In a Hermitian space, the inner product (u,w) on the right-hand side becomes 
complex. This problem is circumvented by taking its absolute value. Let us 
introduce w by 


|(u. w)| 
llall - Iw 


In Euclidean geometry, such w equals the previous g for acute g, but YW = 2 — 
for obtuse g. Thus, w is the smaller of two contiguous angles between intersecting 
lines spanned by the vectors rather than the angle between the vectors themselves. 
In Hermitian geometry, w has the same qualitative sense. However, the geometric 
environment becomes a bit more complicated. In a complex vector space, the vectors 
u, w span the 2-dimensional complex plane C?, whose realification is R*. In real 
terms, the complex lines C - u and C - w are two transversal real planes 1, ~ R? 
and I,, ~ R? within R*. Note that they do not break R* into disjoint pieces. In 
Euclidean geometry, the unit direction vectors e, = u/||u||, ey = w/||w|| on each 
line are unique up to a sign. In the Hermitian case, these unit vectors can be chosen 
arbitrarily on two nonintersecting unit circles 11, 9 S?, 1, S?, which are cut out 
of the unit 3-sphere S* = {v € R* | ||u|| = 1} by transversal planes IT, T,,. Every 
pair of such unit vectors e, € Tl, 9 S?, e, € My NS? is joined by a short arc of 
an equatorial unit circle cut out of S? by the real plane spanned by e,, e,,. Standard 
compactness arguments show that there is some shortest arc among them. The length 
of this shortest arc is called the angle between the complex lines Cu, Cw in the Her- 
mitian space W. Let us show that it is equal to that y introduced in formula (19.8). 
$$\cos\psi = \frac{|(u,w)|}{\|u\|\cdot\|w\|} = \bigl|\bigl(u/\|u\|,\; w/\|w\|\bigr)\bigr|. \tag{19.8}$$

Write the Hermitian inner product in $\mathbb{R}^4 = \mathbb{C}^2$ as

$$(v_1, v_2) = g(v_1, v_2) + i\,\omega(v_1, v_2),$$

where $g(v_1, v_2) = \operatorname{Re}(v_1, v_2)$ is the Euclidean structure uniquely determined by the
equality $g(v,v) = \|v\|^2$ for all $v$. As $e_u$, $e_w$ run through the unit circles, the sum of
squares $g^2(e_u, e_w) + \omega^2(e_u, e_w) = |(e_u, e_w)|^2$ is fixed, because the phase shifts

$$e_u \mapsto \lambda e_u, \qquad e_w \mapsto \mu e_w$$

with $\lambda, \mu \in \mathbb{C}$, $|\lambda| = |\mu| = 1$ do not change $|(e_u, e_w)|$. The minimal Euclidean
angle $\measuredangle(e_u, e_w)$ corresponds to the maximal $\cos^2 \measuredangle(e_u, e_w) = g^2(e_u, e_w)$, that is, to
the minimal $\omega^2(e_u, e_w)$. The latter equals 0 and is attained, because the 3-dimensional
$\omega$-orthogonal

$$e_u^{\perp_\omega} = \{v \in \mathbb{R}^4 \mid \omega(e_u, v) = 0\}$$

has nonzero intersection with the plane $\Pi_w$ in $\mathbb{R}^4$. Thus,

$$|(e_u, e_w)| = \max \cos \measuredangle(e_u, e_w).$$


Note that the Cauchy–Schwarz inequality forces the right-hand side of (19.8) to
lie in the segment [0, 1]. Therefore, the angle between any two complex lines in a
Hermitian space is in the range $0 \le \psi \le \pi/2$.
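
As a numerical illustration of formula (19.8), the following sketch computes the angle between two complex lines and checks that it is unaffected by rescaling either vector by a nonzero complex number. The helper names `herm`, `norm`, and `line_angle`, as well as the sample vectors, are illustrative choices, and the inner product is assumed to be the standard one on $\mathbb{C}^2$, linear in the first argument and antilinear in the second.

```python
import cmath
import math

def herm(u, w):
    """Standard Hermitian product: linear in u, antilinear in w."""
    return sum(a * b.conjugate() for a, b in zip(u, w))

def norm(u):
    return math.sqrt(herm(u, u).real)

def line_angle(u, w):
    """Angle psi in [0, pi/2] between the complex lines C*u and C*w."""
    c = abs(herm(u, w)) / (norm(u) * norm(w))
    return math.acos(min(1.0, c))  # clamp guards against rounding above 1

u = [1 + 0j, 1j]
w = [1 + 0j, 1 + 0j]
psi = line_angle(u, w)

# Rescaling by nonzero complex numbers changes the vectors, not the lines.
lam = 2 * cmath.exp(0.7j)
mu = 0.5 * cmath.exp(-1.9j)
psi_scaled = line_angle([lam * x for x in u], [mu * x for x in w])
```

Here $(u,w) = 1 + i$ and $\|u\| = \|w\| = \sqrt{2}$, so the cosine equals $\sqrt{2}/2$ and both computed angles agree with $\pi/4$ up to rounding.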


19.2 Adjoint Linear Maps 
19.2.1 Hermitian Adjunction 


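For every $\mathbb{C}$-linear map of Hermitian spaces $F : U \to W$, there exists a Hermitian
adjoint map $F^\dagger : W \to U$ defined by means of the commutative diagram

$$
\begin{array}{ccc}
W^* & \xrightarrow{\;F^*\;} & U^* \\
{\scriptstyle h_W}\uparrow & & \uparrow{\scriptstyle h_U} \\
W & \xrightarrow{\;F^\dagger\;} & U
\end{array}
\qquad \text{i.e.,} \quad h_U F^\dagger = F^* h_W, \tag{19.9}
$$

where $h_U$, $h_W$ are the Hermitian correlations on $U$, $W$, and $F^* : W^* \to U^*$, $\xi \mapsto \xi \circ F$,
is the dual map of $F$. Equivalently, $F^\dagger : W \to U$ is the unique $\mathbb{C}$-linear map such
that

$$\forall\, u \in U,\ \forall\, w \in W, \quad (Fu, w) = (u, F^\dagger w). \tag{19.10}$$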


Exercise 19.5 Verify the equivalence of the conditions (19.9) and (19.10). 


If we conjugate both sides in (19.10) and use the conjugate symmetry of the
Hermitian inner product, $(v_1, v_2) = \overline{(v_2, v_1)}$, then (19.10) becomes the equivalent
requirement

$$\forall\, u \in U,\ \forall\, w \in W, \quad (F^\dagger w, u) = (w, Fu). \tag{19.11}$$

In terms of matrices, the relations (19.9)–(19.11) mean that the matrix $F^\dagger_{uw}$ of the
operator $F^\dagger$ in any bases $u$, $w$ of $U$, $W$ is related to the matrix $F_{wu}$ of the operator $F$
and the Gramians $G_u$, $G_w$ of the bases by¹⁰

$$F^\dagger_{uw} = \overline{G}_u^{\,-1}\, \overline{F}_{wu}^{\,t}\, \overline{G}_w. \tag{19.12}$$

For orthonormal bases $u$, $w$, this equation is simplified to $F^\dagger = \overline{F}^{\,t}$. In particular,
$F^{\dagger\dagger} = F$.
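
The matrix relation (19.12) can be checked numerically in a small case. The sketch below is only an illustration under stated assumptions: the inner product is linear in the first argument, vectors are written as coordinate columns, the Gramian of a basis $b$ has entries $(b_i, b_j)$, and the sample bases and the matrix `F` are arbitrary choices. All helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def conj(A):
    return [[z.conjugate() for z in row] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def gram(basis):
    """Gramian ((b_i, b_j)) for the standard Hermitian product on C^2."""
    return [[sum(x * y.conjugate() for x, y in zip(bi, bj)) for bj in basis]
            for bi in basis]

def form(x, G, y):
    """Inner product x^t G conj(y) of vectors with coordinate columns x, y."""
    n = len(x)
    return sum(x[i] * G[i][j] * y[j].conjugate()
               for i in range(n) for j in range(n))

# Two non-orthonormal bases of C^2 (assumed sample data).
Gu = gram([[1 + 0j, 0j], [1 + 0j, 1j]])
Gw = gram([[1 + 0j, 1 + 0j], [0j, 2j]])

F = [[1 + 2j, 0.5 + 0j], [-1j, 3 + 0j]]        # matrix of F in the bases u, w
F_dag = matmul(matmul(inv2(conj(Gu)), transpose(conj(F))), conj(Gw))

x = [2 - 1j, 0.5 + 3j]                          # coordinates of a vector in U
y = [-1 + 1j, 4 + 0j]                           # coordinates of a vector in W
lhs = form(matvec(F, x), Gw, y)                 # (Fu, w)
rhs = form(x, Gu, matvec(F_dag, y))             # (u, F†w)
```

With these conventions `lhs` and `rhs` agree up to rounding, which is exactly the defining property (19.10) of the adjoint.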


Exercise 19.6 Verify that the map $F \mapsto F^\dagger$ is $\mathbb{C}$-antilinear and $(FG)^\dagger = G^\dagger F^\dagger$.


19.2.2. Adjoint Endomorphisms 


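For $U = W$, the Hermitian adjunction of operators is a $\mathbb{C}$-antilinear involution
$\operatorname{End}_{\mathbb{C}}(W) \to \operatorname{End}_{\mathbb{C}}(W)$, $F \mapsto F^\dagger$; that is, it provides the complex vector space
$\operatorname{End}_{\mathbb{C}}(W)$ with a real structure.¹¹ If we write the operators as matrices in some
orthonormal basis of $W$, then this real structure becomes that considered in
Example 18.2 on p. 466. For an arbitrary basis $w$ in $W$, formula (19.12) says that
$F^\dagger_w = \overline{G}_w^{\,-1}\, \overline{F}_w^{\,t}\, \overline{G}_w$.

Real and pure imaginary operators with respect to Hermitian adjunction are
called self-adjoint¹² and anti-self-adjoint¹³ respectively. They form real¹⁴ vector
subspaces

$$\operatorname{End}^+_{\mathbb{C}}(W) = \{F \in \operatorname{End}_{\mathbb{C}}(W) \mid F^\dagger = F\},$$
$$\operatorname{End}^-_{\mathbb{C}}(W) = \{F \in \operatorname{End}_{\mathbb{C}}(W) \mid F^\dagger = -F\},$$

and the realification decomposes as $\operatorname{End}_{\mathbb{C}}(W)_{\mathbb{R}} = \operatorname{End}^+_{\mathbb{C}}(W) \oplus \operatorname{End}^-_{\mathbb{C}}(W)$ as a vector space over $\mathbb{R}$.
An arbitrary operator can be uniquely decomposed into self-adjoint and anti-self-adjoint
components as

$$F = F_+ + F_-,$$

where

$$F_+ = \frac{F + F^\dagger}{2} \in \operatorname{End}^+_{\mathbb{C}}(W), \qquad F_- = \frac{F - F^\dagger}{2} \in \operatorname{End}^-_{\mathbb{C}}(W).$$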


¹⁰ Taking into account that $G_u = \overline{G}_u^{\,t}$ and $G_w = \overline{G}_w^{\,t}$.
¹¹ See Sect. 18.3 on p. 466.
¹² Or Hermitian.
¹³ Or anti-Hermitian.
¹⁴ That is, over the field $\mathbb{R}$.

Multiplication by $i$ and $-i$ assigns the inverse $\mathbb{R}$-linear isomorphisms

$$\operatorname{End}^+_{\mathbb{C}}(W) \rightleftarrows \operatorname{End}^-_{\mathbb{C}}(W), \qquad F_+ \mapsto iF_+, \quad F_- \mapsto -iF_-.$$

In an orthonormal basis of $W$, a self-adjoint operator $F$ has a conjugate symmetric
matrix $\overline{F}^{\,t} = F$, whereas an anti-self-adjoint operator has a conjugate antisymmetric
matrix $\overline{F}^{\,t} = -F$.
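
The decomposition into self-adjoint and anti-self-adjoint parts is easy to try out on a concrete matrix. The sketch below assumes an orthonormal basis, so that the adjoint is the conjugate transpose; the function names `dagger` and `combine` and the sample matrix are illustrative only.

```python
def dagger(F):
    """Conjugate transpose: matrix of the adjoint in an orthonormal basis."""
    n = len(F)
    return [[F[j][i].conjugate() for j in range(n)] for i in range(n)]

def combine(A, B, s):
    """Entrywise (A + s*B) / 2."""
    return [[(a + s * b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

F = [[1 + 2j, 3 - 1j], [4j, 5 + 0j]]
F_plus = combine(F, dagger(F), 1)     # self-adjoint part (F + F†)/2
F_minus = combine(F, dagger(F), -1)   # anti-self-adjoint part (F - F†)/2

# Multiplication by i carries the anti-self-adjoint part to a self-adjoint matrix.
i_F_minus = [[1j * z for z in row] for row in F_minus]
```

For this sample, `F_plus` has real diagonal entries 1 and 5, while `F_minus` has the pure imaginary diagonal entries $2i$ and 0, as the definitions require.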


Exercise 19.7 Verify that the unitary operators are exactly those invertible operators
whose inverses coincide with the Hermitian adjoint:

$$F \in U(W) \iff F^\dagger = F^{-1}.$$


19.2.3 Euclidean Adjunction 


For a real Euclidean vector space $V$, we have defined¹⁵ the adjunction of operators
$F : V \to V$ by means of the symmetric correlation $g : V \xrightarrow{\sim} V^*$ provided by the
Euclidean structure. It maps $F \mapsto F^\vee = g^{-1}F^*g$ and is uniquely determined by

$$\forall\, v_1, v_2 \in V \quad (F^\vee v_1, v_2) = (v_1, F v_2).$$

In terms of matrices, $F^\vee = G^{-1} \cdot F^t \cdot G$, where $G$ is the Gramian of the Euclidean
inner product in the same basis in which the matrix of $F$ is taken. Euclidean
adjunction is an involutive antiautomorphism of the $\mathbb{R}$-algebra $\operatorname{End}_{\mathbb{R}}(V)$. It also
leads to a splitting

$$\operatorname{End}_{\mathbb{R}}(V) = \operatorname{End}^+_{\mathbb{R}}(V) \oplus \operatorname{End}^-_{\mathbb{R}}(V), \quad \text{where } \operatorname{End}^\pm_{\mathbb{R}}(V) = \{F \mid F^\vee = \pm F\},$$

where each $F \in \operatorname{End}_{\mathbb{R}}(V)$ can be decomposed as

$$F = F_+ + F_-, \qquad F_\pm = (F \pm F^\vee)/2 \in \operatorname{End}^\pm_{\mathbb{R}}(V).$$

This picture agrees with Hermitian adjunction under complexification. If we
equip the complexified space $V_{\mathbb{C}} = \mathbb{C} \otimes V$ with the Hermitian product $(u, w)_H$
provided by the $\mathbb{C}$-sesquilinear extension of the Euclidean product in $V$,

$$(u_1 + iu_2,\; w_1 + iw_2)_H \stackrel{\text{def}}{=} \bigl((u_1, w_1) + (u_2, w_2)\bigr) + i\bigl((u_2, w_1) - (u_1, w_2)\bigr), \tag{19.13}$$

then every Euclidean orthonormal basis of $V$ over $\mathbb{R}$ is simultaneously an orthonormal
basis for the Hermitian space $V_{\mathbb{C}}$ over $\mathbb{C}$. For every $\mathbb{R}$-linear operator
$F : V \to V$ and its complexification $F_{\mathbb{C}} : V_{\mathbb{C}} \to V_{\mathbb{C}}$, the relation $(F^\vee)_{\mathbb{C}} = (F_{\mathbb{C}})^\dagger$
holds, because both parts have the same matrix $F^t$ in every real orthonormal
basis of $V$. Therefore, the Euclidean (anti) self-adjoint and orthogonal operators
on $V$, which respectively satisfy $F^\vee = \pm F$ and $F^\vee = F^{-1}$, are complexified to
Hermitian (anti) self-adjoint and unitary operators on $V_{\mathbb{C}}$, which satisfy $F^\dagger = \pm F$
and $F^\dagger = F^{-1}$.

¹⁵ See Sect. 16.5.3 on p. 411.

Equivalently, we can say that Hermitian adjunction $F \mapsto F^\dagger$ is nothing but the
$\mathbb{C}$-antilinear extension of Euclidean adjunction $F \mapsto F^\vee$ from $\operatorname{End}_{\mathbb{R}}(V)$ onto its
complexification¹⁶

$$\mathbb{C} \otimes \operatorname{End}_{\mathbb{R}}(V) \simeq \operatorname{End}_{\mathbb{C}}(V_{\mathbb{C}}).$$


Example 19.1 (Adjoint Linear Differential Operators) Write $V$ for the space of
infinitely differentiable functions $f : [a,b] \to \mathbb{R}$ vanishing together with all
derivatives at the endpoints $a$, $b$. Introduce a Euclidean inner product on $V$ by the
prescription

$$(f, g) = \int_a^b f(t)\,g(t)\,dt.$$

Then the differentiation operator $d/dt : f \mapsto f'$ is anti-self-adjoint, as integration by
parts shows:

$$\Bigl(\frac{d}{dt}f,\; g\Bigr) = \int_a^b f'g\,dt = -\int_a^b fg'\,dt = \Bigl(f,\; -\frac{d}{dt}g\Bigr).$$

For every $f \in V$, the multiplication-by-$f$ operator $g \mapsto fg$ is clearly self-adjoint.
Keeping in mind that adjunction reverses the composition, we can compute the
adjoint operator to any linear differential operator on $V$. For example, the adjoint
operator to $L = t^3 \frac{d^2}{dt^2} : f(t) \mapsto t^3 f''(t)$ is given by

$$f \mapsto (t^3 f)'' = 6tf + 6t^2 f' + t^3 f'',$$

i.e., $L^\vee = t^3 \frac{d^2}{dt^2} + 6t^2 \frac{d}{dt} + 6t$. For a generic second-order operator $L : f \mapsto af'' + bf' + cf$,
where $a, b, c \in V$, we get similarly $L^\vee = a\frac{d^2}{dt^2} - (b - 2a')\frac{d}{dt} + (c - b' + a'')$.
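
The adjoint of $t^3\, d^2/dt^2$ computed above can be tested exactly on $[0,1]$ with rational arithmetic. The sketch below is an illustration only: it replaces the space $V$ by polynomials of the form $t^k(1-t)^k p(t)$, which vanish only to finite order at the endpoints (this is an assumption, sufficient for the boundary terms in two integrations by parts to vanish). Polynomials are stored as coefficient lists, and all function names are hypothetical.

```python
from fractions import Fraction

def add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def mul(p, q):
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def diff(p):
    return [i * p[i] for i in range(1, len(p))] or [Fraction(0)]

def inner(f, g):
    """(f, g) = integral of f*g over [0, 1], computed exactly."""
    return sum(Fraction(c) / (i + 1) for i, c in enumerate(mul(f, g)))

T3 = [0, 0, 0, 1]  # the polynomial t^3

def L(f):
    """L f = t^3 f''."""
    return mul(T3, diff(diff(f)))

def L_adj(g):
    """Claimed adjoint: g -> (t^3 g)''."""
    return diff(diff(mul(T3, g)))

def bump(k):
    """t^k (1 - t)^k, vanishing to order k at both endpoints."""
    p = [Fraction(1)]
    for _ in range(k):
        p = mul(mul(p, [Fraction(0), Fraction(1)]), [Fraction(1), Fraction(-1)])
    return p

f = bump(2)
g = mul(bump(2), [Fraction(1), Fraction(3)])   # (1 + 3t) t^2 (1 - t)^2
lhs = inner(L(f), g)       # (L f, g)
rhs = inner(f, L_adj(g))   # (f, L^v g)
```

Because the arithmetic is exact, `lhs == rhs` holds as an identity of rational numbers rather than up to rounding.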


¹⁶ Compare with Problem 18.3 on p. 478.
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19.3 Normal Operators 


19.3.1 Orthogonal Diagonalization 


An operator $F$ on a Hermitian space $W$ is called normal if it commutes with its
adjoint, i.e., $F^\dagger F = FF^\dagger$. For example, all (anti) self-adjoint operators, which have
$F^\dagger = \pm F$, and all unitary operators, which have $F^\dagger = F^{-1}$, are normal.


Theorem 19.1 An operator $F$ on a Hermitian space $W$ is normal if and only if it can
be diagonalized in some orthonormal basis of $W$. In this case, up to permutations of
diagonal elements, the diagonal matrix of $F$ does not depend on the choice of such
an orthonormal basis.


Proof Let the matrix of $F$ in some orthonormal basis be diagonal. Then the matrix
of the adjoint operator $F^\dagger$ is also diagonal in this basis and therefore commutes with
the matrix of $F$. Since the diagonal elements $\lambda$ of every diagonal matrix of $F$ are in
bijection with the elementary divisors¹⁷ $(t - \lambda) \in \mathcal{E}\ell(F)$, they do not depend on the
choice of basis in which $F$ has a diagonal matrix.
Conversely, let the operator $F : W \to W$ be normal. For $\dim W = 1$ or scalar
$F = \lambda\,\mathrm{Id}_W$, the operator $F$ is diagonal in every orthonormal basis. For $\dim W > 1$
and nonscalar $F$, we use induction on $\dim W$. Since the field $\mathbb{C}$ is algebraically
closed, a nonscalar operator $F$ has a proper nonzero eigenspace $U \subsetneq W$. Then
$W = U \oplus U^\perp$. Since $F^\dagger$ commutes with $F$, it sends the eigenspace $U$ to itself
by Sect. 15.3.3. Therefore, for every $w \in U^\perp$ and all $u \in U$, we have $(Fw, u) =
(w, F^\dagger u) = 0$. This means that $U^\perp$ is $F$-invariant. In the same way, $(F^\dagger w, u) = (w, Fu) = 0$
shows that $U^\perp$ is $F^\dagger$-invariant, so the restriction $F|_{U^\perp}$ is again normal. By the inductive
hypothesis, $F|_{U^\perp}$ has a diagonal matrix in some orthonormal basis of $U^\perp$. We attach to this basis any
orthonormal basis of $U$ and get an orthonormal basis for $W$ in which $F$ is diagonal.
□
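
A small check of the theorem on a concrete operator may be helpful. The sketch below takes the rotation of $\mathbb{C}^2$ by a right angle, which is normal but not self-adjoint, and verifies that its eigenvectors form an orthonormal basis. The matrix `J`, the eigenvectors, and the helper names are illustrative assumptions; the inner product is the standard one, linear in the first argument.

```python
import math

def herm(u, w):
    return sum(a * b.conjugate() for a, b in zip(u, w))

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

J = [[0j, -1 + 0j], [1 + 0j, 0j]]   # J† = -J, so J is anti-self-adjoint

s = 1 / math.sqrt(2)
e1 = [s + 0j, -1j * s]   # eigenvector with eigenvalue  i
e2 = [s + 0j, 1j * s]    # eigenvector with eigenvalue -i

is_normal = matmul(J, dagger(J)) == matmul(dagger(J), J)
overlap = herm(e1, e2)   # vanishes for an orthogonal pair
```

In the basis `e1`, `e2` the operator is diagonal with entries $i$ and $-i$, a pure imaginary spectrum, in agreement with Corollary 19.3.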


Corollary 19.2 A linear operator on a Hermitian space is self-adjoint if and only 
if it has a real spectrum and can be diagonalized in some orthonormal basis. 


Corollary 19.3 A linear operator on a Hermitian space is anti-self-adjoint if 
and only if it has a pure imaginary spectrum and can be diagonalized in some 
orthonormal basis. 


Corollary 19.4 A linear operator $F$ on a Hermitian space is unitary if and only if

$$\operatorname{Spec} F \subset U_1(\mathbb{C}) = \{z \in \mathbb{C} : |z| = 1\}$$

and $F$ can be diagonalized in some orthonormal basis.


¹⁷ Equivalently, we could say that each $\lambda \in \operatorname{Spec} F$ appears on the diagonal exactly $\dim W_\lambda$ times,
where $W_\lambda \subset W$ is the $\lambda$-eigenspace of $F$.

Exercise 19.8 Verify that the unitary group $U_n$ is a compact path-connected subset
in $\operatorname{Mat}_n(\mathbb{C})$.


19.3.2 Normal Operators in Euclidean Space 


An operator $F$ on a real Euclidean space $V$ is called normal if it commutes with its
Euclidean adjoint, i.e., $F^\vee \cdot F = F \cdot F^\vee$. As we have seen in Sect. 19.2.3, this is
equivalent to the normality of the complexified operator $F_{\mathbb{C}}$ on the complexified
space $W = V_{\mathbb{C}}$ equipped with the Hermitian structure (19.13) that extends the
Euclidean structure by sesquilinearity.


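Proposition 19.2. Every self-adjoint operator $F$ on a real Euclidean space $V$ can
be diagonalized in some orthonormal basis. Up to permutation of the diagonal
elements, the result of such a diagonalization does not depend on the choice of
orthonormal basis.


Proof Write $W_\lambda = \{w \in V_{\mathbb{C}} \mid F_{\mathbb{C}}w = \lambda w\}$ and $V_\lambda = \{v \in V \mid Fv = \lambda v\}$
for the $\lambda$-eigenspaces of $F_{\mathbb{C}}$ and $F$ in $V_{\mathbb{C}}$ and $V$ respectively. By Corollary 19.2,
$\operatorname{Spec} F_{\mathbb{C}} = \operatorname{Spec} F$ is real and $V_{\mathbb{C}}$ splits into a direct orthogonal sum:

$$V_{\mathbb{C}} = \bigoplus_{\lambda \in \operatorname{Spec} F_{\mathbb{C}}} W_\lambda. \tag{19.14}$$

In Sect. 18.2.4 on p. 464, we have seen that $W_\lambda = \mathbb{C} \otimes V_\lambda$ for every $\lambda \in \operatorname{Spec} F_{\mathbb{C}} =
\operatorname{Spec} F$. Since the complexification of $\bigoplus V_\lambda$ is the whole of $V_{\mathbb{C}}$, we conclude that
$\bigoplus V_\lambda$ exhausts the whole of $V$. □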


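Proposition 19.3. Every anti-self-adjoint operator $F$ on a real Euclidean space $V$
can be written in an appropriate orthonormal basis of $V$ as a block diagonal matrix

$$\begin{pmatrix} A_1 & & & 0 \\ & A_2 & & \\ & & \ddots & \\ 0 & & & A_k \end{pmatrix}, \quad \text{where } A_j = \begin{pmatrix} 0 & a_j \\ -a_j & 0 \end{pmatrix} \text{ and } a_j \in \mathbb{R}.$$

Up to permutation of blocks, the real numbers $a_j$ do not depend on the choice of
such an orthonormal basis.


Proof In the notation introduced during the proof of Proposition 19.2, we again
have an orthogonal decomposition (19.14), but now all eigenvalues $\lambda \in \operatorname{Spec} F_{\mathbb{C}}$ are
pure imaginary. Since the characteristic polynomial $\chi_{F_{\mathbb{C}}} = \chi_F$ has real coefficients,
all the eigenvalues of $F_{\mathbb{C}}$ split into conjugate pairs $\pm ia$, $a \in \mathbb{R}$. We have seen in
Sect. 18.2.4 that for every such pair, $W_{ia} \oplus W_{-ia} = \mathbb{C} \otimes U$ is the complexification of
some 2-dimensional $F$-invariant subspace $U \subset V$ on which the restricted operator
$F|_U$ has matrix¹⁸

$$\begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}$$

in every basis $v_1$, $v_2$ such that $F_{\mathbb{C}}(v_1 + iv_2) = ia(v_1 + iv_2)$, that is, $Fv_1 = -av_2$ and $Fv_2 = av_1$.
Then the computation $-a(v_1, v_2) = (v_1, Fv_1) = -(Fv_1, v_1) = a(v_2, v_1) = a(v_1, v_2)$ forces $v_1$, $v_2$ to be
orthogonal for $a \ne 0$ (for $a = 0$, any orthonormal basis of $\ker F$ works). Similarly,
$a(v_1, v_1) = (v_1, Fv_2) = -(Fv_1, v_2) = a(v_2, v_2)$ shows that $v_1$, $v_2$ have equal lengths, so
it remains to rescale them by a common factor to unit length. □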


Exercise 19.9 Deduce from Corollary 19.4 another independent proof of Theorem 15.2
on p. 370 about the normal form of a Euclidean isometry.


Exercise 19.10 Show that an operator $F$ on a real Euclidean space $V$ is normal if
and only if it can be written in an appropriate orthonormal basis of $V$ as a block
diagonal matrix that consists of arbitrary $1 \times 1$ blocks and $2 \times 2$ blocks of the form

$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$

Up to permutation of blocks, this matrix does not depend on the choice of
orthonormal basis.


Example 19.2 (Euclidean Quadrics) In a Euclidean space $V$, the real affine
quadrics listed in Sect. 17.5 acquire additional metric invariants preserved by
orthogonal transformations of the ambient space. These invariants are called
semiaxes of the Euclidean quadric and are constructed as follows. In Sect. 16.5.4
on p. 411, we have seen that there is a linear bijection between symmetric bilinear
forms $\beta : V \times V \to \mathbb{R}$ and self-adjoint operators $B : V \to V$. It is given by
$\beta(u, w) = (u, Bw)$ for all $u, w \in V$.


Exercise 19.11 Check that in every orthonormal basis of $V$, the matrix of the
operator $B$ coincides with the Gramian of the form $\beta$.


Thus, by Proposition 19.2, for every quadratic form $q \in S^2V^*$, there exists an
orthonormal basis in $V$ such that $q(x) = a_1x_1^2 + a_2x_2^2 + \dots + a_nx_n^2$ in the coordinates
related to this basis. The coefficients $a_i$ do not depend on the choice of such a
basis, because they are equal to the eigenvalues of the unique self-adjoint operator
$Q : V \to V$ such that $\tilde q(u, w) = (u, Qw)$ for all $u, w \in V$, where $\tilde q$ is the polarization of $q$.

We know from Sect. 17.5.2 on p. 447 that the equation of a smooth central
quadric in an affine coordinate system originating at the center of the quadric¹⁹
looks like $f_2(x) = 1$, where $f_2 \in S^2V^*$. Passing to an orthonormal basis in which $f_2$
becomes diagonal, we conclude that every smooth central quadric in the Euclidean
space $\mathbb{R}^n$ is isometrically congruent to one and only one of the following:

$$a_1^2x_1^2 + \dots + a_p^2x_p^2 - b_1^2x_{p+1}^2 - \dots - b_m^2x_{p+m}^2 = \pm 1, \tag{19.15}$$

where $a_i, b_i > 0$, $p \ge m$, $p + m = n$, and for $p = m = n/2$, only $+1$
is allowed on the right-hand side. These quadrics are obtained from those listed
in formula (17.23) on p. 447 by axial stretching with real positive magnification
coefficients $a_1, a_2, \dots, a_p, b_1, b_2, \dots, b_m$. These coefficients are called the semiaxes
of the quadric.

¹⁸ See formula (18.10) on p. 464.
¹⁹ Which coincides with the pole of the infinite prime and the unique center of symmetry for the
quadric.

For paraboloids,²⁰ the Euclidean structure allows us to indicate a canonical origin
for the affine coordinate system as follows. In the notation of Sect. 17.5.3, write
$c = H_\infty \cap Q$ for the unique point of $Q$ at infinity and $L = \mathbb{P}(c^\perp) \subset \mathbb{P}(V) = H_\infty$ for
the polar of $c$ with respect to the Euclidean scalar product. Then $L$ is a projective
subspace of codimension 2 in $\mathbb{P}(\Bbbk \oplus V)$. The polar line²¹ of $L$ with respect to $Q$ is
called the principal axis of the paraboloid. By construction, the principal axis passes
through $c$ and intersects the paraboloid at one more point within $U_0$. This point is
called the vertex of the paraboloid. In every affine coordinate system originating
at the vertex and having the $n$th coordinate along the principal axis, the equation
of the paraboloid is $g(x) = x_n$, where $g = q|_{c^\perp}$ is a nonsingular homogeneous
quadratic form, the restriction of the extended quadratic form $q$ of the paraboloid to
the Euclidean orthogonal complement to $c$ in $V$.


Exercise 19.12 Check this. 


Now we can choose an orthonormal basis in $c^\perp$ where the Gramian of $g$ becomes
diagonal and conclude that every paraboloid in the Euclidean space $\mathbb{R}^n$ is isometrically
congruent to one and only one of the following:

$$a_1^2x_1^2 + \dots + a_p^2x_p^2 - b_1^2x_{p+1}^2 - \dots - b_m^2x_{p+m}^2 = x_n, \tag{19.16}$$

where $a_i, b_i > 0$, $p \ge m$, and $p + m = n - 1$. These quadrics are obtained from those
listed in formula (17.25) on p. 449 by axial stretching with magnification coefficients
$a_i$, $b_j$ in the directions perpendicular to the principal axis. The constants $a_i$, $b_j$ are
also called semiaxes of the paraboloid.

Cones and cylinders also inherit metric invariants coming from the semiaxes of 
the nondegenerate quadric that is a base for the cone or cylinder. 


²⁰ See Sect. 17.5.3 on p. 448.
²¹ That is, the locus of poles for all hyperplanes passing through $L$, or equivalently, the intersection
of the polar hyperplanes of all points of $L$.
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19.4.1 Polar Decomposition 


Every nonzero complex number $z \in GL_1(\mathbb{C})$ can be factored in polar coordinates
as $z = \varrho \cdot e^{i\vartheta}$, where $\varrho = |z| = \sqrt{z\overline{z}}$ is real and positive, whereas
$e^{i\vartheta} = \cos\vartheta + i\sin\vartheta$ lies in $U_1$. A similar factorization can be done for every
linear operator $Z \in GL_n(\mathbb{C})$. The role of $\varrho$ will be played by a self-adjoint operator
$S = \sqrt{ZZ^\dagger}$ with positive spectrum. Then $S^{-1}Z \in U_n$. The details are as follows.


Lemma 19.2 Let $F : U \to W$ be an arbitrary $\mathbb{C}$-linear operator between
Hermitian spaces. Then both operators $FF^\dagger \in \operatorname{End}(W)$, $F^\dagger F \in \operatorname{End}(U)$ are self-adjoint
and have nonnegative spectra. If the operator $F$ is invertible, then
both spectra are strictly positive. Conversely, if $FF^\dagger$ (respectively $F^\dagger F$) has positive
spectrum, then $F$ admits some right (respectively left) inverse operator.


Proof The self-adjointness of both operators is obvious. By Corollary 19.2 on
p. 491, it forces their eigenvalues to be real. If $FF^\dagger w = \lambda w \ne 0$ for some $w \in W$,
then $F^\dagger w \ne 0$ and $\lambda \cdot (w, w) = (\lambda w, w) = (FF^\dagger w, w) = (F^\dagger w, F^\dagger w)$. Hence,
$\lambda = (F^\dagger w, F^\dagger w)/(w, w) > 0$. Similarly, if $F^\dagger Fu = \mu u \ne 0$, then $Fu \ne 0$ and
$\mu = (Fu, Fu)/(u, u) > 0$. Therefore, the nonzero elements of both spectra are
positive. If the operator $F$ is invertible, then $F^\dagger = h_U^{-1}F^*h_W$ is invertible as well.
Hence, both operators $FF^\dagger$, $F^\dagger F$ have zero kernels. Conversely, if $\ker FF^\dagger = 0$, then
$(\operatorname{im} F)^\perp = \ker F^\dagger = 0$.


Exercise 19.13 Check that $\ker F^\dagger = (\operatorname{im} F)^\perp$.


Thus, $F$ is surjective and therefore invertible from the right. If $\ker F^\dagger F = 0$, then
$F$ is injective and invertible from the left. □


Theorem 19.2 (Polar Decomposition) Every invertible $\mathbb{C}$-linear operator $F$ on
a finite-dimensional Hermitian space admits unique factorizations $F = S_1I_1$ and
$F = I_2S_2$, where the operators $I_1$, $I_2$ are unitary, and the operators $S_1$, $S_2$ are self-adjoint
with strictly positive eigenvalues.


Proof Choose orthonormal bases such that $FF^\dagger$ and $F^\dagger F$ are diagonal²² and put $S_1 = \sqrt{FF^\dagger}$,
$S_2 = \sqrt{F^\dagger F}$ as the diagonal operators obtained by extraction of the positive square
roots from the diagonal elements of the corresponding matrices. Then $S_1$, $S_2$ are
self-adjoint and have positive eigenvalues as well. By construction, $S_1$ commutes
with $FF^\dagger$ and has $S_1^2 = FF^\dagger$. Similarly, $S_2$ commutes with $F^\dagger F$ and has $S_2^2 = F^\dagger F$.
This forces the operators $I_1 = S_1^{-1}F$ and $I_2 = FS_2^{-1}$ to be unitary: $(I_1u, I_1w) =
(S_1^{-1}Fu, S_1^{-1}Fw) = (F^\dagger S_1^{-2}Fu, w) = (F^\dagger(FF^\dagger)^{-1}Fu, w) = (u, w)$, and similarly,
$(I_2u, I_2w) = (FS_2^{-1}u, FS_2^{-1}w) = (u, S_2^{-1}F^\dagger FS_2^{-1}w) = (u, F^\dagger FS_2^{-2}w) = (u, w)$.
Thus, we have obtained the existence of polar decompositions. Let us show that
the factorization $F = S_1I_1$, where $I_1 \in U(W)$ and $S_1$ is self-adjoint positive,
is unique. Since $I_1^\dagger = I_1^{-1}$, we have $FF^\dagger = S_1^2$. Therefore, $S_1$ commutes with
$FF^\dagger$. By Proposition 15.8, the commuting diagonalizable operators $S_1$ and $FF^\dagger$ can
be diagonalized simultaneously in some common basis. Then the action of $S_1$ on
each $\mu$-eigenspace of $FF^\dagger$ is given by some diagonal matrix whose square is $\mu E$.
Since all eigenvalues of $S_1$ are positive, $S_1$ must act by the scalar matrix $\sqrt{\mu}\,E$.
Since this completely describes the action of $S_1$ on the whole space, $S_1$ is uniquely
determined by $F$. Therefore, $I_1 = S_1^{-1}F$ is unique too. The arguments for $F = I_2S_2$
are completely symmetric, and we leave them to the reader. □

²² Such bases may be different for $FF^\dagger$ and $F^\dagger F$ in general.
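
For $2 \times 2$ matrices both polar decompositions can be computed in closed form, which gives a quick sanity check of the theorem. The sketch below is an illustration under stated assumptions: the basis is orthonormal, the sample matrix `F` is arbitrary, and the positive square root is obtained from the Cayley–Hamilton identity $S^2 - (\operatorname{tr} S)S + (\det S)E = 0$, which for $S^2 = M$ gives $S = (M + \sqrt{\det M}\,E)/\sqrt{\operatorname{tr} M + 2\sqrt{\det M}}$. This shortcut is specific to size 2 and is not the construction used in the proof; all function names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def inv2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def sqrt_pd2(M):
    """Positive square root of a 2x2 positive definite Hermitian matrix."""
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    tr = (M[0][0] + M[1][1]).real
    d = math.sqrt(det)              # det S
    t = math.sqrt(tr + 2 * d)       # tr S
    return [[(M[i][j] + (d if i == j else 0)) / t for j in range(2)]
            for i in range(2)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

F = [[1 + 1j, 2 + 0j], [0j, 1 - 2j]]   # invertible sample operator
E = [[1 + 0j, 0j], [0j, 1 + 0j]]

S1 = sqrt_pd2(matmul(F, dagger(F)))    # S1^2 = F F†
I1 = matmul(inv2(S1), F)               # F = S1 I1
S2 = sqrt_pd2(matmul(dagger(F), F))    # S2^2 = F† F
I2 = matmul(F, inv2(S2))               # F = I2 S2
```

In this example `I1` and `I2` come out unitary up to rounding, while `S1` and `S2` differ from each other, since `F` is not normal.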


Exercise 19.14 Prove that every invertible $\mathbb{R}$-linear operator $F$ on a real Euclidean
space can be uniquely factored as $F = S_1I_1$ and as $F = I_2S_2$, where the operators
$I_1$, $I_2$ are orthogonal and the operators $S_1$, $S_2$ are self-adjoint and have strictly positive
eigenvalues.


19.4.2 Exponential Cover of the Unitary Group 


The algebra of power series absolutely convergent in all of $\mathbb{C}$ is suitable for
evaluation²³ at every operator $F : \mathbb{C}^n \to \mathbb{C}^n$. In particular, the exponential $e^F$ is well
defined for every operator $F$. If $F$ is anti-self-adjoint with respect to the standard
Hermitian structure on $\mathbb{C}^n$, then $\mathcal{E}\ell(F)$ consists of $n$ linear binomials of the form
$t - ia$, $a \in \mathbb{R}$. By Proposition 15.9 on p. 382, $\mathcal{E}\ell(e^F)$ consists of linear binomials $t - e^{ia}$,
which are in bijection with the elements of $\mathcal{E}\ell(F)$. We conclude that $e^F : \mathbb{C}^n \to \mathbb{C}^n$
is a unitary operator with eigenvalues $e^{ia} = \cos a + i\sin a$ that are in bijection²⁴
with the eigenvalues $ia$ of $F$. Moreover, if we decompose $\mathbb{C}^n$ as a direct sum of
1-dimensional $F$-invariant subspaces, then they are $e^F$-invariant as well, and each
$ia$-eigenvector of $F$ is an $e^{ia}$-eigenvector of $e^F$. Since every unitary operator can be
obtained in this way, we conclude that the exponential map

$$\operatorname{End}^-_{\mathbb{C}}(\mathbb{C}^n) \to U_n, \qquad F \mapsto e^F, \tag{19.17}$$

is surjective. Therefore, every unitary operator can be written as $I = e^{iT}$ for some
self-adjoint operator $T$. Thus, every $F \in GL_n(\mathbb{C})$ can be factored as $F = Se^{iT}$, where
both $S$ and $T$ are self-adjoint.

²³ See Sect. 15.4 on p. 379.
²⁴ Counting multiplicities.


Caution 19.1 In contrast with Theorem 19.2, the factorization $F = Se^{iT}$ is not
unique, because the exponential map is not injective. For example, $e^{2\pi i\,\mathrm{Id}} = \mathrm{Id}$.
Moreover, the exponential map (19.17) is not a homomorphism, in the sense that
in general $e^{A+B} \ne e^Ae^B$ for noncommuting $A$, $B$. Instead of such a simple formula, there is a
rather complicated infinite expansion²⁵ of $e^{A+B}$ in terms of the iterated commutators
constructed from $A$, $B$.
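
Both warnings in Caution 19.1 can be observed numerically by summing the exponential series directly. The sketch below is only an illustration: the truncation at 60 terms, the function names, and the sample nilpotent matrices are assumptions chosen so that the exact answers are known ($e^A$ and $e^B$ are unitriangular, and $e^{A+B}$ has entries $\cosh 1$ and $\sinh 1$).

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(A, terms=60):
    """Partial sum E + A + A^2/2! + ... of the exponential series."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = mat_add(result, term)
    return result

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]            # AB != BA

lhs = mat_exp(mat_add(A, B))            # e^{A+B}
rhs = matmul(mat_exp(A), mat_exp(B))    # e^A e^B

# Non-injectivity: the exponential of 2*pi*i*Id is the identity.
wrap = mat_exp([[2j * math.pi, 0j], [0j, 2j * math.pi]])
```

Here `rhs` is the integer matrix with rows (2, 1) and (1, 1), whereas the top-left entry of `lhs` is $\cosh 1 \approx 1.543$, so the two sides visibly disagree.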


19.4.3 Singular Value Decomposition 


The singular value decomposition is a weak bivariant version of the polar decomposition.
Let us say that a rectangular matrix $A = (a_{ij})$ is diagonal if $a_{ij} = 0$ for all
$i \ne j$.

Theorem 19.3 For every $\mathbb{C}$-linear map of Hermitian spaces $F : U \to W$, there
exist orthonormal bases of $U$, $W$ such that the matrix of $F$ in these bases is diagonal
with real nonnegative diagonal elements. Up to permutations, the diagonal elements
do not depend on the choice of such orthonormal bases.


Proof Since the endomorphism $F^\dagger F : U \to U$ is self-adjoint, there exists an
orthonormal basis $e_1, e_2, \dots, e_n$ in $U$ formed by the eigenvectors of $F^\dagger F$. By
Lemma 19.2, the spectrum of $F^\dagger F$ is real and nonnegative. Therefore, $F^\dagger Fe_i = \alpha_i^2e_i$
for some real $\alpha_i \ge 0$. Let us renumber the basis vectors in order to have $\alpha_i \ne 0$ for
$1 \le i \le r$ and $\alpha_i = 0$ for all $i > r$. Then for $i > r$, the vector $e_i$ is in $\ker F$, because
$Fe_i$ is orthogonal to $\operatorname{im} F$: $\forall\, u \in U$, $(Fe_i, Fu) = (F^\dagger Fe_i, u) = (0, u) = 0$. At the
same time, for $1 \le i \le r$, the vectors $F(e_i)$ form an orthogonal system in $W$:

$$(Fe_i, Fe_j) = (F^\dagger Fe_i, e_j) = \alpha_i^2(e_i, e_j) =
\begin{cases} \alpha_i^2 > 0 & \text{for } 1 \le i = j \le r, \\ 0 & \text{for } 1 \le i \ne j \le r. \end{cases}$$

Hence, the vectors $f_i = Fe_i/\alpha_i$, $1 \le i \le r$, form an orthonormal basis in $\operatorname{im} F$.

Exercise 19.15 Verify that $f_1, f_2, \dots, f_r$ span $\operatorname{im} F$.


Let us include the vectors $f_i$ in an orthonormal basis $f$ for $W$ by attaching some
orthonormal basis of $(\operatorname{im} F)^\perp$ to them. Then the matrix $F_{fe}$ of the operator $F$ in the
bases $e$, $f$ has the required diagonal form. Given another pair of orthonormal bases
in which $F$ has a diagonal matrix $A$ with $r$ nonzero diagonal elements $\alpha_1, \alpha_2, \dots, \alpha_r$,
the number $r = \operatorname{rk} F$ is base-independent, and the operator $F^\dagger F$ has the matrix $A^\dagger A = \overline{A}^{\,t}A$,
which is diagonal with elements $\alpha_i^2$ on the diagonal. Thus, the $\alpha_i$ are the nonnegative
square roots of the eigenvalues of $F^\dagger F$ and therefore also do not depend on the
choice of bases. □
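
The last step of the proof, that the singular values are the nonnegative square roots of the eigenvalues of $F^\dagger F$, can be turned into a short computation when $U$ is 2-dimensional. The sketch below is restricted to real $m \times 2$ matrices written in orthonormal bases (an assumption made to keep the eigenvalue formula elementary); the function name and the sample matrices are illustrative.

```python
import math

def singular_values_2col(F):
    """Singular values of a real m x 2 matrix, largest first, obtained as
    square roots of the eigenvalues of the symmetric 2 x 2 matrix F^t F."""
    a = sum(row[0] * row[0] for row in F)
    b = sum(row[0] * row[1] for row in F)
    c = sum(row[1] * row[1] for row in F)
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b * b)
    # max() guards against a tiny negative value caused by rounding
    return [math.sqrt(mean + radius), math.sqrt(max(0.0, mean - radius))]

F = [[3.0, 0.0], [4.0, 5.0]]
s1, s2 = singular_values_2col(F)         # F^t F has eigenvalues 45 and 5

G = [[1.0, 2.0], [2.0, 4.0]]             # rank one, so ker G is a line
t1, t2 = singular_values_2col(G)
```

For `F` the product of the singular values equals $|\det F| = 15$, and for the rank-one matrix `G` exactly one singular value vanishes, matching $\dim\ker G = 1$.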


Exercise 19.16 Prove that for every invertible $\mathbb{R}$-linear map of Euclidean spaces
$F : U \to W$, there exist orthonormal bases in $U$, $W$ such that the matrix of $F$ in
these bases is diagonal with nonnegative diagonal elements, which do not depend,
up to permutation, on the choice of such orthonormal bases.


²⁵ It is known as the Campbell–Hausdorff series and can be found in every solid textbook on
Lie algebras, e.g., Lie Groups and Lie Algebras: 1964 Lectures Given at Harvard University, by
J.-P. Serre [Se].


Corollary 19.5 (SVD: Singular Value Decomposition) Every rectangular complex
matrix $F \in \operatorname{Mat}_{m \times n}(\mathbb{C})$ (respectively real matrix $F \in \operatorname{Mat}_{m \times n}(\mathbb{R})$) can be
factored as $F = T_mDT_n$, where $D$ is a diagonal $m \times n$ matrix with real nonnegative
diagonal elements, and $T_m \in U_m$, $T_n \in U_n$ (respectively $T_m \in O_m$, $T_n \in O_n$). The
diagonal matrix $D$ does not depend on the choice of factorization.


Definition 19.1 (Singular Values) Given a $\mathbb{C}$-linear map of Hermitian spaces
(respectively $\mathbb{R}$-linear map of Euclidean spaces) $F : U \to W$, the diagonal elements
$\alpha_i$ of its diagonal matrix from Theorem 19.3 (respectively from Exercise 19.16) are
called singular values of $F$. Any zero diagonal elements that may exist are also
included. Given a real or complex rectangular matrix $F$, the diagonal elements of
the matrix $D$ from Corollary 19.5 are called singular values of the matrix $F$.


Remark 19.1 Geometrically, Theorem 19.3 and Exercise 19.16 say that every linear
map $F$ between real or complex vector spaces equipped with positive inner
products can be factored as a dilatation along perpendicular directions (with the
direction depending on the stretch coefficients) possibly preceded by the orthogonal
projection along the kernel (if $\ker F \ne 0$). Then the nonzero singular values are
the stretch coefficients, and the total number of zero singular values is equal to
$\dim \ker F$.


Example 19.3 (Euclidean Angles Between Subspaces) Let $U$, $W$ be two vector
subspaces in a real Euclidean vector space, $\dim U = n \le m = \dim W$. Write
$\pi : U \to W$ for the projection along $W^\perp$ and $\alpha_1 \ge \alpha_2 \ge \dots \ge \alpha_n$ for the singular
values of $\pi$. Since $|\pi u| = |u| \cdot \cos\measuredangle(\pi u, u)$, the numbers $\alpha_i = \cos\varphi_i$ are the cosines
of increasing angles

$$0 \le \varphi_1 \le \varphi_2 \le \dots \le \varphi_n \le \pi/2, \qquad \varphi_i = \measuredangle(w_i, u_i), \tag{19.18}$$

between the first $n$ vectors of some orthonormal basis $w_1, w_2, \dots, w_m$ of $W$ and
vectors $u_1, u_2, \dots, u_n$ forming an orthonormal basis of $U$. By Exercise 19.16, they
do not depend on the choice of orthonormal basis in $U$ projected to an orthogonal
system of vectors in $W$. On the other hand, such an orthonormal basis can be
constructed geometrically as follows. Choose any orthonormal basis $u_1, u_2, \dots, u_{i-1}$
in $U \cap W$ and put $V_i = (U \cap W)^\perp$, $U_i = U \cap V_i$, $W_i = W \cap V_i$. Then the angle
$\measuredangle(u, w)$ between the unit-length vectors $u \in U_i$, $w \in W_i$, $|u| = |w| = 1$, achieves
its minimal value at a pair of vectors $u_i \in U_i$, $w_i \in W_i$. This is evident from the
compactness of unit spheres and the continuity of $\cos\measuredangle(u, w) = (u, w)$. However,
a purely algebraic argument exists as well.


Exercise 19.17 Given two vector subspaces $U$, $W$, $\dim U \le \dim W$, in a Euclidean
space, show that the maximum of $\cos\measuredangle(u, w)$ taken over all nonzero $u \in U$, $w \in W$
equals the maximal singular value of the orthogonal projection $U \to W$.

Now attach $u_i$ and $w_i$ to the bases being constructed, write $V_{i+1} \subset V_i$ for the
orthogonal complement to the plane spanned by $u_i$, $w_i$, put $U_{i+1} = U_i \cap V_{i+1}$, $W_{i+1} = W_i \cap
V_{i+1}$, and proceed further by induction. We conclude that the cosines of the (nonstrictly) increasing
collection of minimal angles obtained in this way coincide with the singular values
of the projection $\pi$, and that this collection does not depend on possible ambiguities in the choice of $u_i$,
$w_i$ at each step. The angles (19.18) form a complete system of invariants for a pair
of subspaces in Euclidean space.


Exercise 19.18 Show that one pair of subspaces $U'$, $W'$ can be transformed to
another pair of subspaces $U''$, $W''$ by an orthogonal automorphism of the ambient
Euclidean space if and only if $\dim U' = \dim U''$, $\dim W' = \dim W''$, and the
angles (19.18) for $U'$, $W'$ are the same as for $U''$, $W''$.


For arbitrary orthonormal bases $e = (e_1, e_2, \dots, e_n)$ in $U$ and $f = (f_1, f_2, \dots, f_m)$
in $W$, the singular values of their reciprocal Gramian²⁶ $G_{ef} = \bigl((e_i, f_j)\bigr)$ coincide
with the cosines $\alpha_i = \cos\varphi_i$ of the angles (19.18), because in passing to the
orthonormal bases $u$, $w$ constructed above, we get a singular value decomposition
as in Corollary 19.5: $G_{ef} = C_{ue}^t\,G_{uw}\,C_{wf}$, where the Gramian $G_{uw}$ is diagonal,
$C_{ue}^t = C_{ue}^{-1} = C_{eu} \in O(U)$, $C_{wf} \in O(W)$.


Problems for Independent Solution to Chap. 19 


Problem 19.1 Give an explicit example of an operator $F$ on a Hermitian space $W$
and a proper $F$-invariant subspace $U \subset W$ such that $U^\perp$ is not $F$-invariant.


Problem 19.2 Prove that $(\ker F)^\perp = \operatorname{im} F^\dagger$.


Problem 19.3 Let $W = U_1 \oplus U_2$, where the sum is not necessarily orthogonal.
Write $F : W \to U_1$ for the projector along $U_2$. Show that $W = U_1^\perp \oplus U_2^\perp$ and $F^\dagger$
projects $W$ onto $U_2^\perp$ along $U_1^\perp$.


Problem 19.4 (Harmonic Polynomials) Write $U^*$ for the 3-dimensional real
vector space with basis $x$, $y$, $z$ and equip $SU^* \simeq \mathbb{R}[x, y, z]$ with a Euclidean
inner product such that all monomials $x^\alpha y^\beta z^\gamma$ are mutually orthogonal with
$(x^\alpha y^\beta z^\gamma,\, x^\alpha y^\beta z^\gamma) = \alpha!\,\beta!\,\gamma!$. Describe the adjoint operator of the Laplace operator
$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ and show that the homogeneous degree-$m$ polynomials
can be decomposed as $S^mU^* = H_m \oplus \varrho^2 \cdot H_{m-2} \oplus \varrho^4 \cdot H_{m-4} \oplus \cdots$, where
$H_m = \{f \in S^mU^* \mid \Delta f = 0\}$ is the subspace of harmonic homogeneous degree-$m$
polynomials and $\varrho^2 \stackrel{\text{def}}{=} x^2 + y^2 + z^2$.


²⁶ See Sect. 10.2 on p. 233.

Problem 19.5 Let $V$ be the space of smooth periodic functions $\mathbb{R} \to \mathbb{R}$ of period
$T > 0$. Introduce a Euclidean scalar product on $V$ by the prescription

$$(f, g) = \int_0^T f(x)\,g(x)\,dx.$$

Describe the adjoint operator to the linear differential operator of the form

$$a_k\frac{d^k}{dx^k} + a_{k-1}\frac{d^{k-1}}{dx^{k-1}} + \dots + a_1\frac{d}{dx} + a_0,$$

where $a_i = a_i(x) \in V$. Check whether the operator
$$\sin^2\Bigl(\frac{2\pi x}{T}\Bigr)\frac{d^2}{dx^2} + \frac{2\pi}{T}\sin\Bigl(\frac{4\pi x}{T}\Bigr)\frac{d}{dx}$$
is self-adjoint. 


Problem 19.6 Take [a, b] = [0, 1] in Example 19.1 on p. 490 and check whether the 
operator 


(x — ee il 5 + 2x(x — 4 


is self-adjoint. 
Problem 19.7 (Schur’s Theorem) Prove that every C-linear operator on a Hermi- 
tian space has an upper triangular matrix in some orthonormal basis. 


Problem 19.8 Prove that for every normal operator F on a Hermitian space W: 


(a) Every two eigenvectors of F having different eigenvalues are orthogonal. 
(b) Every orthogonal collection of eigenvectors of F is included in some orthogonal 
basis of W formed by eigenvectors of F. 


Problem 19.9 (Criteria of Normality 1) Prove that each of the following conditions
on a $\mathbb{C}$-linear operator $F$ on a Hermitian space is equivalent to the normality
of $F$: (a) every eigenvector of $F$ is an eigenvector for $F^\dagger$ too, (b) $\|Fw\| = \|F^\dagger w\|$
for all $w$, (c) the orthogonal complement to every $F$-invariant subspace is $F$-invariant,
(d) every $F$-invariant subspace is $F^\dagger$-invariant, (e) the self-adjoint and
anti-self-adjoint parts of $F$ commute, (f) the self-adjoint and unitary factors in
the polar decomposition of $F$ commute.

Problem 19.10 For every normal operator $A$ and $k \in \mathbb{N}$, prove that the equation
$X^k = A$ has a normal solution $X$. For which $A$ can all solutions $X$ be written as a
polynomial in $A$?


Problem 19.11 Prove that for every unitary operator $U$ and $k \in \mathbb{N}$, the equation
$X^k = U$ has a unitary solution $X$ that may be written as a polynomial in $U$.

Problem 19.12 Let $F : \mathbb{C}^n \to \mathbb{C}^n$ be a self-adjoint operator with respect to the
standard Hermitian structure.²⁷ For every $r$-dimensional subspace $L \subset \mathbb{C}^n$ with
orthonormal basis $e_1, e_2, \dots, e_r$, put $R_L(F) = \sum_i (Fe_i, e_i)$. (a) Prove that $R_L(F)$
does not depend on the choice of orthonormal basis in $L$. (b) Assuming that $F$ has
distinct eigenvalues $\alpha_1 > \alpha_2 > \dots > \alpha_n$, find $\max_L R_L(F)$ over all $r$-dimensional
subspaces $L \subset \mathbb{C}^n$.

Problem 19.13 (Courant–Fischer–Weyl Min–Max Principle) Let $V$ be an $n$-dimensional
Euclidean space and $Q : V \to V$ a self-adjoint operator with
eigenvalues $\alpha_1 \ge \alpha_2 \ge \dots \ge \alpha_n$. Write $q(v) = (v, Qv)$ for the quadratic form
corresponding to $Q$. For every subspace $U \subset V$, let $m_U(q)$ and $M_U(q)$ denote the
minimal and the maximal values of $q$ on the unit sphere $\{u \in U : |u| = 1\}$ in $U$.
Prove that $\max_{\dim U = k} m_U(q) = \alpha_k = \min_{\dim W = n+1-k} M_W(q)$.

Problem 19.14 Under the notation of Problem 19.13, let $H \subset V$ be a hyperplane
and $q|_H(v) = (v, Pv)$ for some self-adjoint operator $P$ on $H$ with eigenvalues

$$\beta_1 \ge \beta_2 \ge \dots \ge \beta_{n-1}.$$

Show that $\alpha_1 \ge \beta_1 \ge \alpha_2 \ge \beta_2 \ge \dots \ge \alpha_{n-1} \ge \beta_{n-1} \ge \alpha_n$.


Problem 19.15 Show that the exponential map $K \mapsto e^K$ takes real skew-symmetric
matrices to special orthogonal matrices. Is the resulting map

$$\operatorname{End}^-_{\mathbb{R}}(\mathbb{R}^n) \to SO_n(\mathbb{R})$$

surjective?


Problem 19.16 (Cayley’s Parametrization) Verify that the map

$$K \mapsto F = (E - K)(E + K)^{-1}$$

assigns a bijection between real skew-symmetric matrices $K$ and real orthogonal
matrices $F$ such that $\operatorname{Spec} F \not\ni -1$.


Problem 19.17 Prove that $O_n(\mathbb{R})$ and $SO_n(\mathbb{R})$ are compact subsets of $\operatorname{Mat}_n(\mathbb{R})$. Are
they path-connected?

Problem 19.18 An operator $F$ on the Euclidean space $\mathbb{R}^3$ with the standard inner
product has matrix

$$\text{(a)}\ \begin{pmatrix} 1/2 & -\sqrt{3}/2 & 0 \\ \sqrt{3}/4 & 1/4 & -\sqrt{3}/2 \\ 3/4 & \sqrt{3}/4 & 1/2 \end{pmatrix}, \qquad
\text{(b)}\ \begin{pmatrix} \sqrt{2}/2 & -\sqrt{2}/2 & 0 \\ 1/2 & 1/2 & -\sqrt{2}/2 \\ 1/2 & 1/2 & \sqrt{2}/2 \end{pmatrix},$$

in the standard basis. Clarify whether $F$ is a rotation or a rotation followed by a
reflection in the plane orthogonal to the rotation axis. In any case, find the axis and
the angle of rotation.


²⁷ See formula (18.14) on p. 470.

Problem 19.19 Let V be a real Euclidean vector space. Show that the map
$\operatorname{End}(V) \to \operatorname{Hom}(V, V^*)$ sending an operator $F : V \to V$ to the bilinear
form $\beta_F(u, w) \overset{\text{def}}{=} (u, Fw)$ is an isomorphism of vector spaces that takes
(anti) self-adjoint operators to (skew) symmetric forms. Deduce from this that
every quadratic form on V can be written in some orthonormal basis of V as

$$a_1 x_1^2 + a_2 x_2^2 + \cdots + a_n x_n^2.$$

Show that up to renumbering, the coefficients $a_i$ do not depend on the choice of
orthonormal basis.


Problem 19.20 Among the rectangular parallelepipeds circumscribed about the 


2 2 
ellipsoid x} + 2 + 3 = 1 in the Euclidean space R’*, find the maximal and 


minimal lengths of their internal diagonals. 


Problem 19.21 Find both Euclidean polar decompositions of the matrices (a) 


2-1 14 
(; a) (45): 


Problem 19.22 (Criteria of Normality 2) Prove all the criteria of normality
from Problem 19.9 for an $\mathbb{R}$-linear operator F on a Euclidean space.²⁸


Problem 19.23 Given a normal operator F on a real Euclidean space V, prove
(independently of Theorem 19.1 on p. 491) that:

(a) The orthogonal complement of every eigenspace of F is F-invariant.

(b) The operators $F$, $F^{\vee}$ are semisimple²⁹; moreover, V can be decomposed as an
orthogonal direct sum of 1- and 2-dimensional subspaces invariant for both
operators $F$ and $F^{\vee}$.

(c) For dim V = 2 and irreducible F, the matrix of F in every orthonormal basis of
V looks like

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \quad \text{where } b \ne 0.$$

(d) Solve Exercise 19.10 on p. 493 independently of Theorem 19.1 on p. 491.
(e) Deduce Theorem 15.2 on p. 370 as well as Proposition 19.2 and Proposition 19.3
on p. 492 from Exercise 19.10.

²⁸ Of course, $F^{\dagger}$ and $\|w\|$ should be replaced everywhere by $F^{\vee}$ and $|w|$.
²⁹ See Proposition 15.2 on p. 368.


Chapter 20 
Quaternions and Spinors 


20.1 Complex 2 x 2 Matrices and Quaternions 


20.1.1 Mat2(C) as the Complexification of Euclidean R4 


In Example 18.6 on p. 473, we saw that the complex structures on $V = \mathbb{R}^4$ that
form the Hermitian plane $\mathbb{C}^2$ from Euclidean $\mathbb{R}^4$ are numbered by the lines on the
Segre quadric¹ in $\mathbb{P}_3 = \mathbb{P}(\mathbb{C}^4)$. To see this correspondence in detail, let us identify
the complexified space $W = V_{\mathbb{C}} = \mathbb{C}^4$ with the space of complex 2 × 2 matrices
$\operatorname{Mat}_2(\mathbb{C})$ in which the Segre quadric lives. The space $W = \operatorname{Mat}_2(\mathbb{C})$ is equipped
with a natural $\mathbb{C}$-bilinear form

$$\widetilde{\det} : W \times W \to \mathbb{C},$$

the polarization of the quadratic form det, which is the equation of the Segre quadric.
We would like to think of $\widetilde{\det}$ as the $\mathbb{C}$-bilinear extension of the Euclidean structure
on V from V to $V_{\mathbb{C}} = \operatorname{Mat}_2(\mathbb{C})$.

To write $\widetilde{\det}$ explicitly, recall² that for every matrix $\eta$, the formula $\eta \cdot \eta^{\vee} = \det\eta \cdot E$
holds, where $\eta^{\vee}$ is the adjunct matrix³ of $\eta$. For 2 × 2 matrices, the map

$$\eta = \begin{pmatrix} \eta_{11} & \eta_{12} \\ \eta_{21} & \eta_{22} \end{pmatrix} \mapsto \eta^{\vee} = \begin{pmatrix} \eta_{22} & -\eta_{12} \\ -\eta_{21} & \eta_{11} \end{pmatrix} \tag{20.1}$$

is a $\mathbb{C}$-linear involution. Therefore, the polarization of the determinant can be
written as

$$\widetilde{\det}(\eta, \zeta) = \tfrac{1}{2} \operatorname{tr}(\eta \zeta^{\vee}). \tag{20.2}$$

¹ See Example 17.6 on p. 439.
² See formula (9.29) on p. 221.
³ See Sect. 9.6 on p. 220.

© Springer International Publishing AG 2016
A.L. Gorodentsev, Algebra I, DOI 10.1007/978-3-319-45285-2_20


Exercise 20.1 Verify that $\widetilde{\det}(\eta, \zeta)$ is symmetric and nondegenerate. Write its
Gramian in the standard basis⁴ $E_{ij}$ in $\operatorname{Mat}_2$. Check that the involution (20.1) is an
antiautomorphism of the matrix algebra, meaning that $(\eta\zeta)^{\vee} = \zeta^{\vee}\eta^{\vee}$.


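If we replace in (20.2) the adjunct matrix $\eta^{\vee}$ by the Hermitian adjoint $\eta^{\dagger} = \bar{\eta}^{t}$, we
get the $\mathbb{C}$-sesquilinear conjugate-symmetric form

$$(\eta, \zeta) \overset{\text{def}}{=} \tfrac{1}{2} \operatorname{tr}(\eta \zeta^{\dagger}) = \tfrac{1}{2} \sum_{i,j} \eta_{ij} \bar{\zeta}_{ij}, \tag{20.3}$$

which provides $\operatorname{Mat}_2(\mathbb{C})$ with a Hermitian structure such that the squared lengths of
vectors equal

$$\|\eta\|^2 \overset{\text{def}}{=} (\eta, \eta) = \tfrac{1}{2} \sum_{i,j} |\eta_{ij}|^2,$$

and the standard basis $E_{ij}$ is orthogonal with inner product squares $\|E_{ij}\|^2 = 1/2$. We
would like to think of this Hermitian inner product as the $\mathbb{C}$-sesquilinear extension
of the Euclidean structure on V from V to $V_{\mathbb{C}} = \operatorname{Mat}_2(\mathbb{C})$.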

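In fact, our two wishes uniquely determine the inclusion $V \hookrightarrow \operatorname{Mat}_2(\mathbb{C})$ as a real
subspace of an appropriate real structure, because the $\mathbb{C}$-bilinear and $\mathbb{C}$-sesquilinear
extensions of the same Euclidean inner product differ by complex conjugation of
the second argument in the product. Therefore, a real structure

$$\sigma : \operatorname{Mat}_2(\mathbb{C}) \to \operatorname{Mat}_2(\mathbb{C})$$

that has V as a +1-eigenspace should be the composition of the involution (20.1)
with the Hermitian conjugation

$$\eta = \begin{pmatrix} \eta_{11} & \eta_{12} \\ \eta_{21} & \eta_{22} \end{pmatrix} \mapsto \eta^{\dagger} = \begin{pmatrix} \bar{\eta}_{11} & \bar{\eta}_{21} \\ \bar{\eta}_{12} & \bar{\eta}_{22} \end{pmatrix}. \tag{20.4}$$

Exercise 20.2 Check that the involutions $\vee$ and $\dagger$ commute, and deduce from this
that $\vee$, $\dagger$, their composition $\sigma = \vee \circ \dagger = \dagger \circ \vee$, and the identity map together form
the Klein four group.⁵

⁴ See Example 6.7 on p. 129.
⁵ See Example 12.4 on p. 284.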
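Since Hermitian adjunction is a $\mathbb{C}$-antilinear matrix algebra antiautomorphism,⁶
its composition with (20.1) is a $\mathbb{C}$-antilinear matrix algebra automorphism:

$$\sigma : \eta = \begin{pmatrix} \eta_{11} & \eta_{12} \\ \eta_{21} & \eta_{22} \end{pmatrix} \mapsto \eta^{\sigma} = \begin{pmatrix} \bar{\eta}_{22} & -\bar{\eta}_{21} \\ -\bar{\eta}_{12} & \bar{\eta}_{11} \end{pmatrix}. \tag{20.5}$$

Exercise 20.3 Check by direct computation that $(\eta, \zeta) = \widetilde{\det}(\eta, \zeta^{\sigma})$ and
$(\eta\zeta)^{\sigma} = \eta^{\sigma}\zeta^{\sigma}$.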

20.1.2 Algebra of Quaternions 


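The real subspace $V = \operatorname{Re}_{\sigma}(W)$ of the real structure (20.5) on $\operatorname{Mat}_2(\mathbb{C})$ consists of
matrices

$$x = \begin{pmatrix} x_0 + i x_1 & x_2 + i x_3 \\ -x_2 + i x_3 & x_0 - i x_1 \end{pmatrix}, \quad \text{where } x_\nu \in \mathbb{R}. \tag{20.6}$$

The restrictions of the $\mathbb{R}$-bilinear forms (20.2), (20.3) to V coincide and assign there
the Euclidean structure $(x, x) = x_0^2 + x_1^2 + x_2^2 + x_3^2$, an orthonormal basis of which
is formed, for example, by the matrices

$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad i = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad j = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad k = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \tag{20.7}$$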


Since the involution $\sigma$ respects the multiplication, the fixed points of $\sigma$ form an $\mathbb{R}$-subalgebra
in $\operatorname{Mat}_2(\mathbb{C})$. It is called the algebra of quaternions and is denoted by $\mathbb{H}$.
The multiplication table of the basic quaternions (20.7) is

$$i^2 = j^2 = k^2 = -1, \qquad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j. \tag{20.8}$$

Therefore, arbitrary quaternions are multiplied by the formula

$$\begin{aligned} (x_0 + x_1 i + x_2 j + x_3 k) \cdot (y_0 + y_1 i + y_2 j + y_3 k) = {} & (x_0 y_0 - x_1 y_1 - x_2 y_2 - x_3 y_3) \\ & + (x_0 y_1 + x_1 y_0 + x_2 y_3 - x_3 y_2)\, i \\ & + (x_0 y_2 + x_2 y_0 + x_3 y_1 - x_1 y_3)\, j \\ & + (x_0 y_3 + x_3 y_0 + x_1 y_2 - x_2 y_1)\, k, \end{aligned} \tag{20.9}$$

which is just a “command line utility” for matrix multiplication of special matrices (20.6).

⁶ That is, $(\eta\zeta)^{\dagger} = \zeta^{\dagger}\eta^{\dagger}$.
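The agreement between formula (20.9) and multiplication of the matrices (20.6) can be checked numerically. The following is a minimal sketch in Python; the function names `qmul`, `to_matrix`, and `matmul` are illustrative choices, not notation from the text, and a quaternion $x_0 + x_1 i + x_2 j + x_3 k$ is assumed to be stored as the tuple `(x0, x1, x2, x3)`.

```python
def qmul(x, y):
    """Quaternion product computed by the coordinate formula (20.9)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 + x2 * y0 + x3 * y1 - x1 * y3,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )


def to_matrix(x):
    """Complex 2x2 matrix of the form (20.6), stored as a tuple of rows."""
    x0, x1, x2, x3 = x
    return (
        (complex(x0, x1), complex(x2, x3)),
        (complex(-x2, x3), complex(x0, -x1)),
    )


def matmul(a, b):
    """Ordinary product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[r][s] * b[s][c] for s in range(2)) for c in range(2))
        for r in range(2)
    )


# Basic quaternions in the coordinates (x0, x1, x2, x3).
ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

x, y = (1, 2, -3, 4), (-2, 5, 7, 1)
print(qmul(I, J) == K, qmul(J, K) == I, qmul(K, I) == J)
print(to_matrix(qmul(x, y)) == matmul(to_matrix(x), to_matrix(y)))
```

Integer inputs are used so that the comparison of complex entries is exact; with arbitrary floating-point coordinates a tolerance would be needed.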


Exercise 20.4 Given an abstract real vector space $\mathbb{R}^4$ with the basis 1, i, j, k
satisfying the multiplication table (20.8), verify that formula (20.9) provides
this space $\mathbb{R}^4$ with the structure of an associative $\mathbb{R}$-algebra.


20.1.3 Real and Pure Imaginary Quaternions 


By analogy with the complex numbers, the quaternions of the real line $\mathbb{R} \cdot 1 \subset \mathbb{H}$ spanned
by the unit are called real, whereas the quaternions lying in the 3-dimensional
subspace $I = \{x \cdot i + y \cdot j + z \cdot k \mid x, y, z \in \mathbb{R}\}$ are called pure imaginary.
In the language of matrices, the pure imaginary quaternions are exactly the anti-Hermitian
traceless matrices: $I = \{\eta \in \operatorname{Mat}_2(\mathbb{C}) \mid \eta^{\dagger} = -\eta,\ \operatorname{tr} \eta = 0\}$. The real
quaternions are the real scalar matrices $\mathbb{R} \cdot E$. Thus, Hermitian adjunction of matrices
fixes all real quaternions and changes signs of all pure imaginary quaternions. In
analogy with complex numbers, this operation is called quaternionic conjugation
and traditionally is denoted by an asterisk:

$$q = x_0 + x_1 i + x_2 j + x_3 k \mapsto q^* \overset{\text{def}}{=} q^{\dagger} = x_0 - x_1 i - x_2 j - x_3 k.$$

This is an antiautomorphism of the $\mathbb{R}$-algebra $\mathbb{H}$, i.e., $(pq)^* = q^* p^*$. Note also
that the real and imaginary quaternions are orthogonal in the Euclidean structure on
$\mathbb{H}$. Indeed, formula (20.9) shows that the Euclidean scalar product is related to the
multiplication by the equalities

$$(p, q) = \operatorname{Re}(p q^*) = \operatorname{Re}(p^* q), \tag{20.10}$$

which force $(1, i) = (1, j) = (1, k) = 0$.


20.1.4 Quaternionic Norm 


Since the squared Euclidean length of a quaternion $\eta = x_0 + x_1 i + x_2 j + x_3 k$ is
nothing but the determinant, $\|\eta\|^2 = \sum x_\nu^2 = (\eta, \eta) = \det(\eta)$, the Euclidean length
is multiplicative with respect to quaternionic multiplication: $\|\eta\zeta\| = \|\eta\| \cdot \|\zeta\|$ for all
$\eta, \zeta \in \mathbb{H}$. Traditionally, the length of a quaternion is called the quaternionic norm.
The multiplicativity of the norm can also be seen directly from the relations (20.8)
as follows. For every $q \in \mathbb{H}$, the product $q \cdot q^*$ is self-adjoint and therefore real, that
is, $q \cdot q^* = \operatorname{Re}(q \cdot q^*)$. So, formula (20.10) written for p = q becomes the remarkable
equality

$$\|q\|^2 = q \cdot q^*, \tag{20.11}$$

similar to what we have for the complex numbers. The multiplicativity of the norm
now follows:

$$\|pq\|^2 = pq(pq)^* = p q q^* p^* = p \|q\|^2 p^* = \|p\|^2 \cdot \|q\|^2.$$


It is quite astonishing that the coordinate form of this relation, called Euler’s four-square
identity,

$$\begin{aligned} (x_0^2 + x_1^2 + x_2^2 + x_3^2)(y_0^2 + y_1^2 + y_2^2 + y_3^2) = {} & (x_0 y_0 - x_1 y_1 - x_2 y_2 - x_3 y_3)^2 + (x_0 y_1 + x_1 y_0 + x_2 y_3 - x_3 y_2)^2 \\ & + (x_0 y_2 + x_2 y_0 + x_3 y_1 - x_1 y_3)^2 + (x_0 y_3 + x_3 y_0 + x_1 y_2 - x_2 y_1)^2, \end{aligned} \tag{20.12}$$

appeared in 1748, in a letter from Euler to Christian Goldbach, almost a century
before the multiplication table (20.8) was discovered by Sir William Rowan
Hamilton in 1843. (Euler’s four-square identity was used by Joseph Louis Lagrange
in 1770 to prove that every nonnegative integer can be represented as the sum of
four perfect squares.)
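As a sanity check, identity (20.12) can be tested on integer inputs. The sketch below is only a numerical illustration, not a proof; the helper name `four_square_terms` is hypothetical and simply returns the four expressions that are squared on the right-hand side of (20.12).

```python
from itertools import product


def four_square_terms(x, y):
    """The four expressions squared on the right-hand side of (20.12)."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 + x2 * y0 + x3 * y1 - x1 * y3,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )


def identity_holds(x, y):
    """Compare both sides of Euler's four-square identity for given x, y."""
    lhs = sum(a * a for a in x) * sum(b * b for b in y)
    rhs = sum(c * c for c in four_square_terms(x, y))
    return lhs == rhs


# Exhaustive check over all integer vectors with coordinates in {-1, 0, 1}.
small = list(product(range(-1, 2), repeat=4))
print(all(identity_holds(x, y) for x in small for y in small))
print(four_square_terms((1, 2, 3, 4), (5, 6, 7, 8)))
```

For x = (1, 2, 3, 4) and y = (5, 6, 7, 8) both sides equal 30 · 174 = 5220.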


20.1.5 Division 


Another crucial consequence of (20.11) is the invertibility of every nonzero
quaternion $q \in \mathbb{H}$. Namely, the quaternion $q^{-1} = q^* / \|q\|^2$ is clearly a two-sided
inverse to q. Thus, $\mathbb{H}$ is an associative, noncommutative division algebra⁷ over the
field $\mathbb{R}$, meaning that the addition and multiplication of quaternions satisfy all the
field axioms except for the commutativity of multiplication.


20.2 Geometry of Quaternions 


Recall that the cross product of two vectors u, w in the Euclidean space $\mathbb{R}^3$ equipped
with the standard orientation is a vector $u \times w \in \mathbb{R}^3$ uniquely determined by the
following properties:

• The length $|u \times w|$ equals the Euclidean area of the parallelogram spanned by
u, w.
• If $u \times w \ne 0$, then $u \times w$ is orthogonal to both u and w, and u, w, $u \times w$ is a basis of positive orientation.

Equivalently, we could say that $u \times w \in V$ is the Euclidean dual vector to the covector

$$\omega(u, w, *) : V \to \mathbb{R}, \quad v \mapsto \omega(u, w, v),$$

where $\omega$ is the standard volume form on $\mathbb{R}^3$ assigning the unit volume to the standard
basis.

⁷ Or skew field.


Exercise 20.5 Convince yourself of the equivalence of both definitions. Then for
two vectors $u = (u_1, u_2, u_3)$, $w = (w_1, w_2, w_3)$, write the orthogonality relations

$$(u, x) = u_1 x_1 + u_2 x_2 + u_3 x_3 = 0,$$

$$(w, x) = w_1 x_1 + w_2 x_2 + w_3 x_3 = 0,$$

in the unknown vector $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ and verify that the basic solution of
these equations provided by Cramer’s rule from Proposition 9.5 on p. 224 coincides
with the cross product:

$$x = (u_2 w_3 - w_2 u_3,\ -u_1 w_3 + w_1 u_3,\ u_1 w_2 - w_1 u_2) = u \times w.$$


Lemma 20.1 Take i, j, k from (20.7) as the standard orienting basis in the
Euclidean space of pure imaginary quaternions $I \simeq \mathbb{R}^3$. Then $\operatorname{Im}(pq) = p \times q$
for all $p, q \in I$. In particular, pq is real if and only if $q = \lambda p$ for some $\lambda \in \mathbb{R}$.

Proof Since both maps $p, q \mapsto p \times q$ and $p, q \mapsto \operatorname{Im}(pq)$ are bilinear, it is enough
to check the relation $\operatorname{Im}(pq) = p \times q$ for nine pairs of basic vectors p, q = i, j, k.
This is exactly the multiplication table (20.8). □
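Lemma 20.1 is easy to test on concrete vectors. The sketch below assumes that a pure imaginary quaternion $p_1 i + p_2 j + p_3 k$ is stored as the tuple `(p1, p2, p3)`; the helper names `qmul` and `cross` are illustrative. Besides the imaginary part, it also prints the real part of pq, which by (20.10) equals −(p, q).

```python
def qmul(x, y):
    """Quaternion product by formula (20.9); quaternions are 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 + x2 * y0 + x3 * y1 - x1 * y3,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )


def cross(u, w):
    """Cross product in R^3, as written out in Exercise 20.5."""
    return (
        u[1] * w[2] - w[1] * u[2],
        -u[0] * w[2] + w[0] * u[2],
        u[0] * w[1] - w[0] * u[1],
    )


p, q = (1, 2, 3), (4, 5, 6)
pq = qmul((0, *p), (0, *q))       # embed p and q as pure imaginary quaternions
print(pq[1:] == cross(p, q))      # Im(pq) = p x q
print(pq[0] == -sum(a * b for a, b in zip(p, q)))  # Re(pq) = -(p, q)
```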


Lemma 20.2 Two arbitrary quaternions $p, q \in \mathbb{H}$ are orthogonal if and only if
$pq^* \in I$. Two pure imaginary quaternions $p, q \in I$ are orthogonal if and only if
$pq = -qp$, and in this case, $pq = -qp \in I$ is perpendicular to the plane spanned
by p, q.

Proof The first statement follows directly from formula (20.10). All other claims
follow from Lemma 20.1. □


Lemma 20.3 The solution set of the equation $x^2 = -1$ in $\mathbb{H}$ is the unit sphere
$S^2 = \{x \in I \mid \|x\| = 1\} \subset I \simeq \mathbb{R}^3$.

Proof The equation $x^2 = -1$ forces $\|x\| = 1$. Then $x^* = x^{-1} = -x$, and therefore
$x \in I$. Conversely, for every $x \in S^2$, we have $x^2 = -x^* x = -x^{-1} x = -1$. □


Lemma 20.4 Three arbitrary quaternions i, j, k satisfy the relations (20.8) if and
only if they form a positive orthonormal basis in I, where the orientation in I is given
by the initial orthonormal basis (20.7).

Proof The relations $i^2 = j^2 = k^2 = -1$ mean that all three quaternions lie on the
unit sphere $S^2 \subset I$. Then by Lemma 20.2, the relations $i \cdot j = k = -j \cdot i$ mean that k
is perpendicular to both i and j, and the orthonormal basis i, j, k is positive. □


20.2.1 Universal Covering $S^3 = \mathrm{SU}_2 \twoheadrightarrow \mathrm{SO}_3(\mathbb{R})$


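For matrices of determinant 1, the adjunct matrix coincides with the inverse. Hence
the special unitary group

$$\mathrm{SU}_2 = \{\psi \in \operatorname{Mat}_2(\mathbb{C}) \mid \det \psi = 1 \ \&\ \psi^{-1} = \psi^{\dagger}\}$$

consists of all $\psi \in \operatorname{Mat}_2(\mathbb{C})$ with $\det \psi = 1$ and $\psi^{\vee} = \psi^{\dagger}$. The latter means that $\psi$ is
$\sigma$-real and therefore lies in $\mathbb{H}$. We conclude that $\mathrm{SU}_2$ coincides with the unit sphere

$$S^3 = \{\psi \in \mathbb{H} \mid \|\psi\| = 1\} \subset \mathbb{H} \simeq \mathbb{R}^4.$$

This group acts on the quaternionic algebra $\mathbb{H}$ by conjugation. Namely, for every
$\psi \in S^3$, let⁸

$$F_\psi : \mathbb{H} \to \mathbb{H}, \quad q \mapsto \psi q \psi^{-1}. \tag{20.13}$$

Exercise 20.6 Check that $F_\psi$ is an $\mathbb{R}$-algebra automorphism and

$$F : \mathrm{SU}_2 \to \operatorname{Aut}(\mathbb{H}), \quad \psi \mapsto F_\psi,$$

is a group homomorphism.

Since $\det(\psi q \psi^{-1}) = \det q$, the operator $F_\psi$ is a Euclidean isometry of $\mathbb{H}$. Since $F_\psi$
preserves the central line $\mathbb{R} \cdot 1$, the space $I = 1^{\perp}$ of pure imaginary quaternions
is $F_\psi$-invariant. As $\psi$ is continuously deformed to 1 within $S^3$, the restricted
orthogonal isometry $F_\psi|_I : I \to I$ is continuously deformed to $\mathrm{Id}_I$ within $\mathrm{O}_{\det}(I)$.
Therefore, $F_\psi|_I \in \mathrm{SO}_{\det}(I) \simeq \mathrm{SO}_3(\mathbb{R})$. We get the group homomorphism

$$S^3 = \mathrm{SU}_2 \to \mathrm{SO}_{\det}(I) \simeq \mathrm{SO}_3(\mathbb{R}), \quad \psi \mapsto F_\psi|_I. \tag{20.14}$$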
Since $F_\psi(\psi) = \psi$ and $F_\psi(1) = 1$, the operator $F_\psi : \mathbb{H} \to \mathbb{H}$ leaves fixed each
vector in the 2-dimensional plane $\Pi_\psi = \mathbb{R} \cdot 1 \oplus \mathbb{R} \cdot \psi$. Therefore, $F_\psi|_I : I \to I$ is a
rotation about the line $\ell_\psi \overset{\text{def}}{=} \Pi_\psi \cap I$. Let us fix one of two pure imaginary quaternions
of length 1 on this line⁹ and denote it by l. We identify the real plane $\Pi_\psi$ with the
field $\mathbb{C}$ by

$$\mathbb{C} \ni (x + iy) \longleftrightarrow (x + yl) \in \Pi_\psi. \tag{20.15}$$

Such an identification provides our quaternion $\psi \in \Pi_\psi \simeq \mathbb{C}$ with an argument
$\alpha = \operatorname{Arg} \psi$ uniquely determined by the equality $\psi = \cos\alpha + l \cdot \sin\alpha$. Therefore,
$\psi^{-1} = \cos\alpha - l \cdot \sin\alpha$.

⁸ Since $\psi^{-1} = \psi^*$ for every $\psi$ with $\|\psi\| = 1$, the same operator could be described by the
formula $F_\psi : q \mapsto \psi q \psi^*$.

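Lemma 20.5 The operator $F_\psi|_I \in \mathrm{SO}_{\det}(I)$ is rotation about the line $\ell_\psi$ by the
angle $2 \operatorname{Arg}(\psi)$ viewed in the direction of the vector $l \in \ell_\psi$.

Proof Fit the vector l into the positively oriented orthonormal basis l, m, n in I. By
Lemma 20.4, the multiplication table of quaternions l, m, n is $l^2 = m^2 = n^2 = lmn = -1$
as in (20.8). Therefore,

$$\begin{aligned} \psi m \psi^{-1} &= (\cos\alpha + l \sin\alpha)\, m\, (\cos\alpha - l \sin\alpha) \\ &= (m \cos\alpha + n \sin\alpha)(\cos\alpha - l \sin\alpha) \\ &= m(\cos^2\alpha - \sin^2\alpha) + 2n \cos\alpha \sin\alpha = m \cos(2\alpha) + n \sin(2\alpha), \\ \psi n \psi^{-1} &= (\cos\alpha + l \sin\alpha)\, n\, (\cos\alpha - l \sin\alpha) \\ &= (n \cos\alpha - m \sin\alpha)(\cos\alpha - l \sin\alpha) \\ &= n(\cos^2\alpha - \sin^2\alpha) - 2m \cos\alpha \sin\alpha = n \cos(2\alpha) - m \sin(2\alpha). \end{aligned}$$

Thus, $F_\psi$ acts on the vectors (m, n) by the matrix

$$\begin{pmatrix} \cos(2\alpha) & -\sin(2\alpha) \\ \sin(2\alpha) & \cos(2\alpha) \end{pmatrix}.$$

□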
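The doubling of the angle in Lemma 20.5 can be observed numerically. The following sketch takes l = i, m = j, n = k and $\psi = \cos\alpha + i \sin\alpha$; the function names are illustrative, and $\psi^{-1}$ is computed as $\psi^*$, which is valid only under the assumption $\|\psi\| = 1$ (see footnote 8).

```python
import math


def qmul(x, y):
    """Quaternion product by formula (20.9); quaternions are 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 + x2 * y0 + x3 * y1 - x1 * y3,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )


def qconj(q):
    """Quaternionic conjugation q -> q*."""
    return (q[0], -q[1], -q[2], -q[3])


def rotate(psi, v):
    """Image of v in I = R^3 under F_psi; psi is assumed to have norm 1."""
    w = qmul(qmul(psi, (0.0, *v)), qconj(psi))
    return w[1:]


alpha = math.pi / 6
psi = (math.cos(alpha), math.sin(alpha), 0.0, 0.0)  # psi = cos(a) + i sin(a)

print(rotate(psi, (1.0, 0.0, 0.0)))  # the axis l = i stays fixed
print(rotate(psi, (0.0, 1.0, 0.0)))  # m = j goes to cos(2a) j + sin(2a) k
print(rotate(psi, (0.0, 0.0, 1.0)))  # n = k goes to -sin(2a) j + cos(2a) k
```

Replacing `psi` by its negative gives the same rotation, in agreement with the kernel {±1} of the homomorphism (20.14).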


Corollary 20.1 The group homomorphism (20.14) is surjective with kernel $\{\pm 1\} \simeq \mathbb{Z}/(2)$. □


20.2.2 Topological Comment 


In topological language, the homomorphism (20.14) is a double covering of $\mathrm{SO}_3(\mathbb{R})$
by the sphere $S^3$. Since the map (20.14) identifies the diametrically opposite points
of the sphere, its image is homeomorphic to the real projective space $\mathbb{P}_3 = \mathbb{P}(\mathbb{H})$.
Hence, $\mathrm{SO}_3(\mathbb{R})$ is homeomorphic to $\mathbb{P}(\mathbb{R}^4)$. Since the sphere $S^3$ is simply
connected¹⁰ and the group $\mathrm{SO}_3(\mathbb{R})$ is path-connected, the covering (20.14) is a
universal covering. This means, in particular, that $\pi_1(\mathrm{SO}_3) = \mathbb{Z}/(2)$, i.e., there
is a smooth noncontractible loop within $\mathrm{SO}_3(\mathbb{R})$ that becomes contractible after
being traversed twice. Such a loop, that is, a rotation that varies with
time and becomes the identity after some time, can be visualized as follows. Put
a book in the palm of your hand and rotate it slowly through 360° keeping it
horizontal all the time, as in Fig. 20.1, which is borrowed from a remarkable book
by George K. Francis.¹¹ After a full turn, the motion of your hand traces exactly such a loop in
$\mathrm{SO}_3$. A troublesome tension in your elbow joint distinctly witnesses against the
contractibility of this loop. However, if you overcome the discomfort and continue to
rotate the book in the same direction, then during the next turn, your hand straightens
itself and returns to the starting “contracted” position.

⁹ These two quaternions are the intersection points $\ell_\psi \cap S^2$ and are opposites of each other.
¹⁰ That is, the fundamental group $\pi_1(S^3)$ is trivial.


Fig. 20.1 Rotating a book 


¹¹ A Topological Picturebook [Fr].

20.2.3 Two Pencils of Hermitian Structures


By Lemma 20.3, every pure imaginary quaternion n of norm 1 has $n^2 = -1$.
Therefore, left and right multiplication by such quaternions provides $\mathbb{H} = \mathbb{R}^4$ with
two families of complex structures:

$$I'_n : \mathbb{H} \to \mathbb{H},\ \eta \mapsto n\eta, \quad \text{and} \quad I''_n : \mathbb{H} \to \mathbb{H},\ \eta \mapsto \eta n, \tag{20.16}$$

compatible with the Euclidean structure¹² on $\mathbb{H}$ and numbered by the points $n \in S^2$
on the unit sphere in $I \simeq \mathbb{R}^3$. The real plane $\Pi_n = \mathbb{R} \cdot 1 \oplus \mathbb{R} \cdot n$ is invariant with
respect to both operators (20.16) and can be identified with the field $\mathbb{C}$ by the same
rule $x + iy \leftrightarrow x + yn$ as in formula (20.15) above. Choose some $m \in S^2 \cap \Pi_n^{\perp}$.
Then $\mathbb{H}$, considered a 2-dimensional vector space over $\mathbb{C}$, can be decomposed into
the direct sum of two 1-dimensional subspaces¹³

$$\mathbb{C} \cdot 1 \oplus \mathbb{C} \cdot m = \mathbb{H} = 1 \cdot \mathbb{C} \oplus m \cdot \mathbb{C}. \tag{20.17}$$

Multiplication by the complex number $i \in \mathbb{C}$ acts on the left and right decompositions
respectively as left and right multiplication by n. These actions coincide on $\mathbb{C} \cdot 1 = \Pi_n = 1 \cdot \mathbb{C}$
and are conjugate to each other on $\mathbb{C} \cdot m = \Pi_n^{\perp} = m \cdot \mathbb{C}$, because
$I'_n(m) = nm = -mn = -I''_n(m)$. Therefore, the two complex structures (20.16)
are distinct. Clearly, each of them differs from all structures $I'_l$, $I''_l$ with $l \ne \pm n$,
because the latter do not send the plane $\Pi_n$ to itself. The operators $I'_n$ and $I'_{-n} = -I'_n$
provide the real plane $\Pi_n = \Pi_{-n}$ with conjugate complex structures and therefore
are distinct as well. The same holds for the operators $I''_n$ and $I''_{-n} = -I''_n$ as well as
for the operators $I'_n$ and $I''_{-n}$. Thus, we see that all complex structures (20.16) are in
fact distinct.

Note that this agrees with Example 18.6 on p. 473, because both line rulings of
the Segre quadric are parametrized by $\mathbb{P}_1(\mathbb{C}) \simeq S^2$ (see Fig. 11.4 in Example 11.1
on p. 256).


Exercise 20.7 Convince yourself that the decomposition (20.17) is much like the
presentation of the complex number field in the form $\mathbb{C} = \mathbb{R} \oplus i\mathbb{R}$. Namely, one
could define $\mathbb{H}$ as a “double $\mathbb{C}$,” that is, a set of formal records $z + w \cdot j$, where
$z, w \in \mathbb{C}$ and the symbol j has $j^2 = -1$ and is $\mathbb{C}$-antilinear: $jz = \bar{z}j$. In other words,
the records are multiplied according to the rule

$$(z_1 + w_1 \cdot j) \cdot (z_2 + w_2 \cdot j) = (z_1 z_2 - w_1 \bar{w}_2) + (z_1 w_2 + w_1 \bar{z}_2) \cdot j.$$
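The “double $\mathbb{C}$” rule can be compared with formula (20.9). In the sketch below a record $z + w \cdot j$ is stored as a pair of Python complex numbers `(z, w)`, and the quaternion $x_0 + x_1 i + x_2 j + x_3 k$ is assumed to correspond to $z = x_0 + x_1 i$, $w = x_2 + x_3 i$ (so that $w \cdot j = x_2 j + x_3 k$). The names `cd_mul` and `to_pair` are illustrative.

```python
def qmul(x, y):
    """Quaternion product by formula (20.9); quaternions are 4-tuples."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    return (
        x0 * y0 - x1 * y1 - x2 * y2 - x3 * y3,
        x0 * y1 + x1 * y0 + x2 * y3 - x3 * y2,
        x0 * y2 + x2 * y0 + x3 * y1 - x1 * y3,
        x0 * y3 + x3 * y0 + x1 * y2 - x2 * y1,
    )


def to_pair(x):
    """Record z + w*j of the quaternion x, with z = x0 + x1*i, w = x2 + x3*i."""
    return (complex(x[0], x[1]), complex(x[2], x[3]))


def cd_mul(a, b):
    """Product of records (z1 + w1*j)(z2 + w2*j), using j*z = conj(z)*j, j*j = -1."""
    (z1, w1), (z2, w2) = a, b
    return (z1 * z2 - w1 * w2.conjugate(), z1 * w2 + w1 * z2.conjugate())


x, y = (1, 2, -3, 4), (-2, 5, 7, 1)
print(cd_mul(to_pair(x), to_pair(y)) == to_pair(qmul(x, y)))
```

Note that the conjugations in `cd_mul` matter: without them the rule describes the commutative algebra $\mathbb{C}[j]/(j^2 + 1)$ rather than $\mathbb{H}$.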


¹² In the sense of Proposition 18.4 on p. 472; essentially this means that the Euclidean length of a
vector coincides with its Hermitian norm.

¹³ The left decomposition is valid in the complex structure $I'_n$, whereas the right holds for $I''_n$.

20.3 Spinors


20.3.1 Geometry of Hermitian Enhancements of Euclidean $\mathbb{R}^4$


Let us return to the notation from Example 17.6 on p. 439, that is, we shall consider
a 2-dimensional Hermitian vector space $U = \mathbb{C}^2$ and put $W = \operatorname{End}_{\mathbb{C}}(U)$. This
W again becomes $\operatorname{Mat}_2(\mathbb{C})$ as soon as we fix some orthonormal basis in U and
write all the operators $\eta : U \to U$ in this basis using matrices. The points on the
two dual projective lines $\mathbb{P}_1^{\times} = \mathbb{P}(U^*)$ and $\mathbb{P}_1 = \mathbb{P}(U)$ are called spinors.¹⁴ They
are in bijection with the complex structures (20.16) on $\mathbb{H}$ as follows. The Segre
embedding¹⁵

$$\mathbb{P}_1^{\times} \times \mathbb{P}_1 \xrightarrow{\ \sim\ } Z(\det) \subset \mathbb{P}(W), \quad (\xi, v) \mapsto \xi \otimes v \in \operatorname{End}(U), \quad \text{where } \xi \otimes v : U \to U,\ u \mapsto \xi(u) \cdot v, \tag{20.18}$$

sends the coordinate lines $\xi \times \mathbb{P}_1$ and $\mathbb{P}_1^{\times} \times v$ to the lines on the Segre quadric
$Z(\det) \subset \mathbb{P}(W)$. The latter lines are projectivizations of the 2-dimensional subspaces

$$U_\xi = \{F : U \to U \mid \ker F = \operatorname{Ann} \xi\} \quad \text{and} \quad U_v = \{F : U \to U \mid \operatorname{im} F = \mathbb{C} \cdot v\}$$

in W. Write $I_v : V \to V$ for the complex structure provided by $U_v$ in accordance
with Proposition 18.4 on p. 472 and let $I_v^{\mathbb{C}} : W \to W$ denote the complexified
operator. Then the +i and −i eigenspaces of $I_v^{\mathbb{C}}$ are $U_v$ and $\sigma(U_v)$, where

$$\sigma : W \to W$$

is the real structure defined in formula (20.5) on p. 505. The Segre quadric has no
$\sigma$-real points, because the restriction of the quadratic form $\det : W \to \mathbb{C}$ to V is a
positive Euclidean form. Therefore, $\sigma$ sends every line on $Z(\det)$ to a line from the
same ruling family.¹⁶ In other words, $\sigma$ acts on the spinor lines $\mathbb{P}_1^{\times}$, $\mathbb{P}_1$, and the
$\sigma$-conjugate subspaces $U_v$, $\sigma(U_v) = U_{\sigma(v)}$ come from some $\sigma$-conjugate spinors v,
$\sigma(v)$. Since $U_v$ and $U_{\sigma(v)}$ consist of rank-1 operators $F : U \to U$ whose images are
spanned by v and $\sigma(v)$, the operator $I_v^{\mathbb{C}} : W \to W$ is forced to be left multiplication
by the operator

$$\psi_v : U \to U, \quad v \mapsto iv, \quad \sigma(v) \mapsto -i\sigma(v) = \sigma(iv). \tag{20.19}$$

By construction, this operator is $\sigma$-invariant, i.e., lies in $V = \mathbb{H}$. We conclude that
the complex structure $I_v : \mathbb{H} \to \mathbb{H}$ acts as left multiplication by the quaternion
$\psi_v \in \mathbb{H}$ defined in formula (20.19). This quaternion is also called a spinor.

¹⁴ Physicists like to say that these two families of dual spinors are of opposite chiralities.
¹⁵ See 17.13 on p. 439.
¹⁶ If the lines $\ell$ and $\sigma(\ell)$ lie in different families, then their intersection $\ell \cap \sigma(\ell)$ is a real point of
$Z(\det)$.
Symmetrically, write $\sigma^* : U^* \to U^*$ for the dual action of $\sigma$ on $U^*$. Then
the det-isotropic subspaces $U_\xi$ and $\sigma(U_\xi) = U_{\sigma^*(\xi)}$ come from some $\sigma^*$-conjugate
spinors $\xi$ and $\sigma^*(\xi) = \xi \circ \sigma$. They consist of rank-1 operators $F : U \to U$ taking
$u \in U$ respectively to $\xi(u) \cdot w'$ and to $\xi(\sigma u) \cdot w''$ for some $w', w'' \in U$. Therefore,
$I_\xi^{\mathbb{C}}$ sends $F : U \to U$ to $F \circ \psi_\xi$, where $\psi_\xi : U \to U$ satisfies the conditions


STP ae abe 20.20 
Me Vesa te igor = £0 (0). — 


By construction, $\psi_\xi \in \mathbb{H}$ and the complex structure $I_\xi : \mathbb{H} \to \mathbb{H}$ acts as right
multiplication by $\psi_\xi$. The quaternion $\psi_\xi$ is called a spinor¹⁷ as well.


20.3.2 Explicit Formulas 


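Fix a standard basis in $U = \mathbb{C}^2$ and write the vectors $v \in U$ as coordinate columns
and covectors $\xi \in U^*$ as coordinate rows. The spinor space U is equipped with
a $\mathbb{C}$-linear skew-symmetric correlation $\delta : U \xrightarrow{\sim} U^*$, $\delta^* = -\delta$, provided by the
nondegenerate skew-symmetric form¹⁸ det. It takes a vector $v \in U$ to the covector

$$\delta v : u \mapsto \det(u, v) = u_0 v_1 - u_1 v_0, \quad \text{i.e.,} \quad \delta : \begin{pmatrix} v_0 \\ v_1 \end{pmatrix} \mapsto (v_1, -v_0).$$

The involution $F \mapsto F^{\vee} = \delta^{-1} F^* \delta$ on $W = \operatorname{End}(U)$ is the (right) adjunction of
operators¹⁹ by the correlation $\delta$. Also, there is a $\mathbb{C}$-antilinear correlation $h : U \xrightarrow{\sim} U^*$
provided by the Hermitian inner product $(*, *)_H$ on U. It takes a vector $v \in U$ to
the covector $hv : u \mapsto (u, v)_H = u_0 \bar{v}_0 + u_1 \bar{v}_1$, i.e.,

$$h : \begin{pmatrix} v_0 \\ v_1 \end{pmatrix} \mapsto (\bar{v}_0, \bar{v}_1).$$

The involution $F \mapsto F^{\dagger} = h^{-1} F^* h$ on W takes F to the Hermitian adjoint operator
with respect to h. The real structure $\sigma = \vee \circ \dagger$ on W maps
$F \mapsto F^{\sigma} = F^{\vee\dagger} = h^{-1} \delta^* F \delta^{*-1} h = (\delta^{-1} h)^{-1} F\, \delta^{-1} h$, i.e., conjugates endomorphisms of U by the
$\mathbb{C}$-antilinear involution

$$\sigma_U = \delta^{-1} h : U \to U, \quad \begin{pmatrix} v_0 \\ v_1 \end{pmatrix} \mapsto \begin{pmatrix} -\bar{v}_1 \\ \bar{v}_0 \end{pmatrix}.$$

¹⁷ Of chirality other than $\psi_v$.
¹⁸ See Example 6.4 on p. 125.
¹⁹ See Sect. 16.3, especially formula (16.24) on p. 399.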
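Note that it has no fixed points on $\mathbb{P}_1 = \mathbb{P}(U)$. To find the matrix $\psi_v$ that corresponds
to the spinor $v = (z_0 : z_1) \in \mathbb{P}_1$, let us normalize $v \in U$ by the condition²⁰
$(v, v)_H = z_0 \bar{z}_0 + z_1 \bar{z}_1 = 1$. Then

$$\det \begin{pmatrix} z_0 & -\bar{z}_1 \\ z_1 & \bar{z}_0 \end{pmatrix} = 1 \tag{20.21}$$

as well. The operator $\psi_v$ has a diagonal matrix with eigenvalues +i, −i in the basis
v, $\sigma_U v$. Therefore, in the standard basis of $U = \mathbb{C}^2$, the matrix of $\psi_v$ is equal to

$$\begin{pmatrix} z_0 & -\bar{z}_1 \\ z_1 & \bar{z}_0 \end{pmatrix} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \begin{pmatrix} \bar{z}_0 & \bar{z}_1 \\ -z_1 & z_0 \end{pmatrix} = i \begin{pmatrix} |z_0|^2 - |z_1|^2 & 2 z_0 \bar{z}_1 \\ 2 \bar{z}_0 z_1 & |z_1|^2 - |z_0|^2 \end{pmatrix}.$$

Thus, the complex structure $I_v : \mathbb{H} \to \mathbb{H}$ corresponding to the spinor

$$v = (z_0 : z_1) \in \mathbb{P}_1$$

acts as left multiplication by the pure imaginary quaternion

$$\psi_v = (|z_0|^2 - |z_1|^2) \cdot i + 2 z_0 \bar{z}_1 \cdot k. \tag{20.22}$$

Exercise 20.8 Check that $\|\psi_v\| = 1$ and verify that the complex structure $I_\xi$
provided by the spinor $\xi = \delta v = (-z_1 : z_0) \in \mathbb{P}_1^{\times}$ acts on $\mathbb{H}$ as right multiplication
by $\psi_v$.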


20.3.3 Hopf Bundle 


The correspondence $v \mapsto \psi_v$ described above in fact maps the 3-sphere

$$S^3 = \{v \in U \mid (v, v)_H = 1\} \subset \mathbb{C}^2 \simeq \mathbb{R}^4$$

onto the 2-sphere of pure imaginary quaternions of norm 1:

$$S^2 = \{q \in I \mid \|q\| = 1\} \subset I \simeq \mathbb{R}^3.$$

The quaternion $\psi_v$ in (20.22) does not depend on the phase of the spinor v, i.e., it is
not changed under rescaling $v \mapsto \vartheta v$ by $\vartheta \in \mathrm{U}_1 = S^1 \subset \mathbb{C}$. Thus, the fibers of the
map

$$S^3 \twoheadrightarrow S^2, \quad v \mapsto \psi_v, \tag{20.23}$$

are unit circles within $S^3$. The map (20.23) is called the Hopf bundle. In purely
topological terms, it is described as follows. The complex projective line $\mathbb{P}(\mathbb{C}^2) = \mathbb{P}_1 \simeq S^2$
is the space of nonzero orbits for the action of the multiplicative group $\mathbb{C}^*$
on $\mathbb{C}^2$ by scalar dilatations. Since $\mathbb{C}^* = \mathbb{R}^*_{>0} \times \mathrm{U}_1$, the quotient map $\mathbb{C}^2 \smallsetminus 0 \twoheadrightarrow \mathbb{P}_1$
can be factored into the composition of the quotient map

$$\mathbb{C}^2 \smallsetminus 0 \twoheadrightarrow (\mathbb{C}^2 \smallsetminus 0)/\mathbb{R}^*_{>0} \simeq S^3$$

followed by the quotient map $S^3 \twoheadrightarrow S^2 = S^3/\mathrm{U}_1$. The first takes a nonzero
vector $u \in \mathbb{C}^2$ to the unique $v = \lambda u$ with $\lambda \in \mathbb{R}_{>0}$ such that $(v, v)_H = 1$. This is
exactly the normalization made in formula (20.21) above. The second quotient map,
which factorizes the unit sphere by phase shifts, is the Hopf bundle.

²⁰ This can be done via multiplication of v by an appropriate positive real constant.
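Both steps of this description can be traced in a short computation based on formula (20.22). In the sketch below the function name `hopf` is an illustrative choice; it first normalizes the vector $(z_0, z_1)$ as in (20.21) and then returns the coefficients of i, j, k in $\psi_v$, using the identification $(a + ib) \cdot k = ak - bj$ of a complex coefficient with a quaternion in $\mathbb{R} \cdot 1 \oplus \mathbb{R} \cdot i$.

```python
import cmath
import math


def hopf(z0, z1):
    """Coefficients (x1, x2, x3) of i, j, k in psi_v from (20.22).

    The nonzero vector (z0, z1) is first rescaled to unit Hermitian length.
    """
    r = math.sqrt(abs(z0) ** 2 + abs(z1) ** 2)
    z0, z1 = complex(z0) / r, complex(z1) / r
    c = 2 * z0 * z1.conjugate()  # c = a + i*b, and (a + i*b) k = a*k - b*j
    return (abs(z0) ** 2 - abs(z1) ** 2, -c.imag, c.real)


v = (1 + 2j, 3 - 1j)
theta = cmath.exp(0.7j)  # a phase factor in U_1

point = hopf(*v)
print(point)
print(math.isclose(sum(t * t for t in point), 1.0))  # psi_v has norm 1
print(hopf(theta * v[0], theta * v[1]))              # same point of S^2
```

Rescaling v by a positive real number or by a phase factor leaves the output unchanged, which reflects the factorization of $\mathbb{C}^2 \smallsetminus 0 \twoheadrightarrow \mathbb{P}_1$ through $S^3$.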


Problems for Independent Solution to Chap. 20 


Problem 20.1 Give a basis in $\operatorname{Mat}_2(\mathbb{C})$ in which the Gramian of the quadratic form
det is diagonal with diagonal elements (+1, −1, −1, −1).


Problem 20.2 For $q \in \mathbb{H}$, let $C(q) = \{w \in \mathbb{H} \mid wq = qw\}$. For all possible values
of $\dim_{\mathbb{R}} C(q)$, describe all quaternions $q \in \mathbb{H}$ with that value. In particular, show
that the center $Z(\mathbb{H})$ is equal to $\mathbb{R} \cdot 1$.


Problem 20.3 Show that every nonreal quaternion is a root of a quadratic polyno- 
mial with real coefficients and negative discriminant. 


Problem 20.4 Solve the following systems of linear equations in x, y € H: 


k=(i+js-x+U+h)-y, : k= (1+i)-x+4+/-y, 
i=(1+i-x+(+4)-y, i=(l+)-x+k-y. 


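Problem 20.5 Check that the subspaces

$$\{q \in \mathbb{H} \mid q^* = -q\} \quad \text{and} \quad \{q \in \mathbb{H} \mid q^2 \in \mathbb{R}_{\le 0} \cdot 1\}$$

coincide. Denote them by $I \subset \mathbb{H}$. Show that

(a) the map $x, y \mapsto [x, y] \overset{\text{def}}{=} xy - yx$ sends $I \times I$ to I,

(b) the $\mathbb{R}$-bilinear form $(p, q) \overset{\text{def}}{=} (pq^* + qp^*)/2$ provides I with a Euclidean
structure,

(c) this Euclidean structure coincides with $\det|_I$,

(d) $[x, y] = 2\, x \times y$ for all $x, y \in I$.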

Problem 20.6 For every nonzero $a \in \mathbb{H}$, show that the conjugation map

$$\varphi_a : \mathbb{H} \to \mathbb{H}, \quad q \mapsto a q a^{-1},$$

is an automorphism of the $\mathbb{R}$-algebra $\mathbb{H}$ and a Euclidean isometry of the subspace
of pure imaginary quaternions $I \subset \mathbb{H}$.

Problem 20.7 For a given complex 2 × 2 matrix $a \in \mathrm{SU}_2 \subset \mathbb{H}$, write down an
explicit real 3 × 3 matrix of the orthogonal operator $\varphi_a : I \to I$ from Problem 20.6
in the basis i, j, k.


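Problem 20.8 For two matrices $g_1, g_2 \in \mathrm{SL}_2(\mathbb{C})$, write $\psi_{g_1, g_2} \in \operatorname{End}_{\mathbb{C}}(\operatorname{Mat}_2(\mathbb{C}))$
for the linear endomorphism $X \mapsto g_1 X g_2^{-1}$. Check that the assignment

$$(g_1, g_2) \mapsto \psi_{g_1, g_2}$$

yields a well-defined group homomorphism

$$\mathrm{SL}_2(\mathbb{C}) \times \mathrm{SL}_2(\mathbb{C}) \to \mathrm{SO}_{\det}(\operatorname{Mat}_2(\mathbb{C})).$$

Describe its kernel and its image. For given matrices $g_1, g_2 \in \mathrm{SL}_2(\mathbb{C})$, write down
an explicit real orthogonal 4 × 4 matrix of the operator $\psi_{g_1, g_2}$ in the basis you have
constructed in Problem 20.1.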


Problem 20.9 For two quaternions $p, q \in \mathbb{H}$ of unit norm $\|p\| = \|q\| = 1$,
write $\psi_{pq} : \mathbb{H} \to \mathbb{H}$ for the linear endomorphism $\eta \mapsto p \eta q^{-1}$. Check that the
map $(p, q) \mapsto \psi_{pq}$ is a well-defined group epimorphism $\mathrm{SU}_2 \times \mathrm{SU}_2 \twoheadrightarrow \mathrm{SO}_4(\mathbb{R})$.
Describe its kernel. For given quaternions p, q, write down an explicit real
orthogonal 4 × 4 matrix of the operator $\psi_{pq}$ in the basis 1, i, j, k.


Problem 20.10 Write $e \in \mathbb{H}$ for the unit of the quaternionic algebra. Prove that the
following collections of quaternions form multiplicative subgroups in $\mathbb{H}$:

(a) the 8 quaternions ±e, ±i, ±j, ±k.

(b) the 16 quaternions (±e ± i ± j ± k)/2.

(c) the 24 quaternions obtained from $(\pm e \pm i)/\sqrt{2}$ by permutations of the symbols
e, i, j, k.

(d) the 24 quaternions obtained by combining the two groups (a), (b) into one.²¹

(e) the 120 quaternions obtained as the union of the group (d) with 96 quaternions
produced from $(\pm e \pm \alpha i \pm \alpha^{-1} j)/2$, where $\alpha = (1 + \sqrt{5})/2$, by all even
permutations of the symbols e, i, j, k.


Problem 20.11 (Binary Group of the Icosahedron) Show that the group (e)
from Problem 20.10 is isomorphic to $\mathrm{SL}_2(\mathbb{F}_5)$ and admits a surjective
homomorphism onto $A_5$.


²¹ Compare with Problem 20.12 below.

Problem 20.12 (Octaplex) Let $C^4 \subset \mathbb{R}^4$ be the standard regular cocube, i.e., the
convex hull of the standard basis vectors and their opposites. Let $Q^4 \subset \mathbb{R}^4$ be
a regular cube with the same center as $C^4$, with the usual orientation of faces
perpendicular to the coordinate axes, but inscribed in the same unit sphere as $C^4$.
The convex hull of the united vertices of $C^4$ and $Q^4$ is called an octaplex and
denoted by $O^4$. Prove that $O^4$ is a regular polyhedron, meaning that its complete
group $\mathrm{O}_{O^4}$ acts transitively on the set of complete flags formed by a vertex, an
edge joined with this vertex, a 2-dimensional face joined with this edge, etc.
Calculate: (a) the total number of faces in each dimension, (b) the lengths of the
edges and the radius of the inscribed sphere, (c) the area of the 2-dimensional
face (and tell which polygon it is), (d) the volume of the 3-dimensional face
(and tell which polyhedron it is), (e) the total volume of $O^4$, (f) the order of the
complete octaplex group $|\mathrm{O}_{O^4}|$.
Exercise 1.1 Answer: $2^n$.


Exercise 1.2 The answer to the second question is negative. Indeed, let X = {1, 2},
Y = {2}. Using unions and intersections, we can obtain only the sets

$$X \cap Y = Y \cap Y = Y \cup Y = Y,$$
$$X \cup Y = X \cup X = X \cap X = X.$$

Thus, each formula built from X, Y, $\cap$, and $\cup$ produces either X = {1, 2} or Y = {2}.
But $X \smallsetminus Y = \{1\}$.

Exercise 1.3 There are six surjections $\{0, 1, 2\} \twoheadrightarrow \{0, 1\}$ and no injections.
Symmetrically, there are six injections $\{0, 1\} \hookrightarrow \{0, 1, 2\}$ and no surjections.


Exercise 1.5 If X is finite, then a map $X \to X$ that is either injective or surjective is
automatically bijective. Every infinite set X contains a subset isomorphic to the set
of positive integers $\mathbb{N}$, which admits the nonsurjective injection $n \mapsto (n + 1)$ and
the noninjective surjection $1 \mapsto 1$, $n \mapsto (n - 1)$ for $n \ge 2$. Both can be extended to
maps $X \to X$ by the identity action on $X \smallsetminus \mathbb{N}$.


Exercise 1.6 Use Cantor’s diagonal argument: assume that all bijections $\mathbb{N} \xrightarrow{\sim} \mathbb{N}$ are
numbered by positive integers, run through this list and construct a bijection that
sends each k = 1, 2, 3, ... to a number different from the image of k under the
k-th bijection in the list.


Exercise 1.7 Answer: $\binom{n+m-1}{m-1} = \binom{n+m-1}{n} = \frac{(n+m-1)!}{n!\,(m-1)!}$. Hint: the summands are
in bijection with the numbered collections of nonnegative integers $(k_1, k_2, \ldots, k_m)$
such that $\sum k_i = n$. Such a collection is encoded by the word consisting of (m − 1)
letters 0 and n letters 1: write $k_1$ ones, then zero, then $k_2$ ones, then zero, etc.
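The encoding from the hint can be tried out directly. The following sketch counts the collections $(k_1, \ldots, k_m)$ by brute force and compares the result with the binomial coefficient; the function names `count_collections` and `encode` are illustrative choices.

```python
from itertools import product
from math import comb


def count_collections(n, m):
    """Number of ordered collections (k_1, ..., k_m) of nonnegative integers with sum n."""
    return sum(1 for ks in product(range(n + 1), repeat=m) if sum(ks) == n)


def encode(ks):
    """Word of the hint: k_1 ones, then zero, then k_2 ones, then zero, etc."""
    return "0".join("1" * k for k in ks)


print(encode((2, 0, 1)))  # a word with 3 ones and 2 zeros
print(all(
    count_collections(n, m) == comb(n + m - 1, m - 1)
    for n in range(6)
    for m in range(1, 5)
))
```

Every word with n ones and m − 1 zeros arises from exactly one collection, so counting the words amounts to choosing the m − 1 positions of the zeros among n + m − 1 letters.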


Exercise 1.8 Answer: $\binom{n+k}{k}$. A diagram is a broken line going from the bottom
left-hand corner of the rectangle to its upper right-hand corner and consisting of n
horizontal and k vertical edges.
Exercise 1.9 If z is equivalent to both x and y, then each equivalence u ~ x implies
by transitivity and symmetry the equivalence u ~ y and conversely.

Exercise 1.10 Let $[x']_n = [x]_n$ and $[y']_n = [y]_n$, that is, $x' = x + nk$, $y' = y + n\ell$ for
some $k, \ell \in \mathbb{Z}$. Then $x' + y' = x + y + n(k + \ell)$ and $x'y' = xy + n(\ell x + ky + k\ell n)$
are congruent modulo n to x + y and xy respectively, i.e., $[x' + y']_n = [x + y]_n$ and
$[x'y']_n = [xy]_n$.

Exercise 1.11 Say that x ~ y if there exists a chain satisfying the conditions from 
Exercise 1.11. Verify that this is an equivalence relation and check that it is a subset 
of every equivalence relation containing R. 

Exercise 1.12 Check of transitivity: if (p, q) ~ (r, s) and (r, s) ~ (u, w), i.e.,
ps − rq = 0 = us − rw, then psw − rqw = 0 = usq − rwq, which forces s(pw − uq) = 0
and pw = uq, i.e., (p, q) ~ (u, w).

Exercise 1.13 Let $\alpha$ be the smaller of the two angles between $\ell_1$ and $\ell_2$. Then
reflection in $\ell_1$ followed by reflection in $\ell_2$ is a rotation about $O = \ell_1 \cap \ell_2$ through
the angle $2\alpha$ in the direction from $\ell_1$ to $\ell_2$. Thus, $\sigma_1 \sigma_2 = \sigma_2 \sigma_1$ if and only if the lines
are perpendicular.


Exercise 1.14 The table of products gf is as follows: 


f \ g     (1,2,3)  (1,3,2)  (3,2,1)  (2,1,3)  (2,3,1)  (3,1,2)


(1,2,3)   (1,2,3)  (1,3,2)  (3,2,1)  (2,1,3)  (2,3,1)  (3,1,2)
(1,3,2)   (1,3,2)  (1,2,3)  (3,1,2)  (2,3,1)  (2,1,3)  (3,2,1)
(3,2,1)   (3,2,1)  (2,3,1)  (1,2,3)  (3,1,2)  (1,3,2)  (2,1,3)
(2,1,3)   (2,1,3)  (3,1,2)  (2,3,1)  (1,2,3)  (3,2,1)  (1,3,2)
(2,3,1)   (2,3,1)  (3,2,1)  (2,1,3)  (1,3,2)  (3,1,2)  (1,2,3)
(3,1,2)   (3,1,2)  (2,1,3)  (1,3,2)  (3,2,1)  (1,2,3)  (2,3,1)
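
A table of this kind can be regenerated mechanically. The sketch below assumes that a permutation is written in one-line notation $(\sigma_1, \sigma_2, \sigma_3)$ meaning $k \mapsto \sigma_k$, that the product $gf$ means "apply $f$ first, then $g$", and that rows are indexed by $f$ and columns by $g$; the layout and the names `compose`, `order`, and `table` are assumptions made for the illustration.

```python
def compose(g, f):
    """One-line notation of the map k -> g(f(k)) (apply f first, then g)."""
    return tuple(g[f[k] - 1] for k in range(len(f)))


order = [(1, 2, 3), (1, 3, 2), (3, 2, 1), (2, 1, 3), (2, 3, 1), (3, 1, 2)]

# table[f, g] is the product gf; rows are indexed by f, columns by g.
table = {(f, g): compose(g, f) for f in order for g in order}

for f in order:
    print(f, [table[f, g] for g in order])
```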


Exercise 1.16 Let $x \in W$ be the minimal element in the set of $w \in W$ for
which $\Sigma(w)$ fails. Since $\Sigma(w)$ holds for all $w < x$, then $\Sigma(x)$ must hold as well.
Contradiction.


Exercise 1.19 The axiom of choice allows us to choose an upper bound $b(W) \in P$
for every $W \in \mathcal{W}(P)$. If $f(x) > x$ for all $x \in P$, then the map


$\mathcal{W}(P) \to P$, $W \mapsto f(b(W))$,


contradicts Lemma 1.2 on p. 15. 
Exercise 2.2 Answers: $1 + x$ and $xy + x + y$.


Exercise 2.3 It is enough to verify the invariance of the equivalence classes of the 
results only under changes of fractions by means of the generating relations (2.12), 
i.e., by $\frac{a}{b} \mapsto \frac{ac}{bc}$.
Exercise 2.5 Use increasing induction on $k$ starting from $k = 0$ to verify that all
$E_k$ belong to $(a,b)$ (thus, all $E_k$ are divisible by $\mathrm{GCD}(a,b)$). Then use decreasing
induction on $k$ starting from $k = r + 1$ to verify that all $E_k$ are divisible by $E_r$ (thus,
$E_0 = a$, $E_1 = b$, and $\mathrm{GCD}(a,b) = ax + by$ are divisible by $E_r$).
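
The induction in this hint mirrors the extended Euclidean algorithm, in which every remainder is kept in the form $ax + by$. A minimal sketch for integers follows; the function name `extended_gcd` and the variable names are illustrative and not taken from the text.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = a*x + b*y the last nonzero remainder."""
    # Invariant: r0 = a*x0 + b*y0 and r1 = a*x1 + b*y1,
    # so every remainder lies in the ideal (a, b).
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0


d, x, y = extended_gcd(240, 46)
print(d, x, y)
```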

Exercise 2.8 The existence of a factorization can be proved by induction on $|n|$:
If $n$ is prime, its prime factorization is $n = n$; if not, then $n = n_1 \cdot n_2$, where
$|n_1|, |n_2| < |n|$. Thus, $n_1$, $n_2$ are factorizable by induction. The proof of uniqueness
is based on the following: for all $z \in \mathbb{Z}$ and prime $p \in \mathbb{Z}$, either $\mathrm{GCD}(z,p) = |p|$
and $p \mid z$ or $\mathrm{GCD}(z,p) = 1$ and $p$, $z$ are coprime. Given two coinciding products
$p_1p_2 \cdots p_k = q_1q_2 \cdots q_m$, it follows from Lemma 2.3 on p. 26 that $p_1$ cannot be
coprime to each $q_i$, because $p_1$ divides $\prod q_i$. Thus, $p_1$ divides some $q_i$, say $q_1$. Since
$q_1$ is prime, $q_1 = \pm p_1$. Cancel $q_1$ and $p_1$ and repeat the argument.

Exercise 2.9 $(1 + a)^{-1} = \sum_{k \geq 0} (-1)^k a^k = 1 - a + a^2 - a^3 + \cdots$ (the sum is finite,
because $a$ is nilpotent).

Exercise 2.10 The residue class $\binom{mp^n}{p^n} \pmod p$ is equal to the coefficient of $x^{p^n}$ in
the expansion of the binomial $(1 + x)^{mp^n}$ over the finite field $\mathbb{F}_p = \mathbb{Z}/(p)$. Since the
map $a \mapsto a^p$ respects sums over $\mathbb{F}_p$, its $n$-fold iteration yields $(1 + x)^{p^n} = 1 + x^{p^n}$.
Hence $(1 + x)^{mp^n} = (1 + x^{p^n})^m = 1 + mx^{p^n} + \text{higher powers}$.
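
The resulting congruence, read here as $\binom{mp^n}{p^n} \equiv m \pmod p$, is easy to test numerically for small primes. This is only a sanity check under that reading of the hint; the helper name `check` and the tested ranges are assumptions made for the example.

```python
from math import comb


def check(p, n, m):
    """Test whether binom(m * p^n, p^n) is congruent to m modulo p."""
    return comb(m * p**n, p**n) % p == m % p


assert all(
    check(p, n, m)
    for p in (2, 3, 5, 7)
    for n in (1, 2)
    for m in range(1, 6)
)
print("congruence holds on the tested range")
```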

Exercise 2.12 The axioms can be checked componentwise, and they clearly hold
because each factor is a ring.

Exercise 2.13 An element $(a,b) \in \mathbb{F}_p \times \mathbb{F}_q$ is invertible if and only if $a \neq 0$ and
$b \neq 0$. A nonzero element $(a,b) \in \mathbb{F}_p \times \mathbb{F}_q$ divides zero if and only if $a = 0$ or
$b = 0$.

Exercise 2.17 Both statements follow from Lemma 2.5 on p. 33, which says that 
every ring homomorphism to an integral domain sends the unit element to the unit 
element. 

Exercise 2.18 Since the polynomial $x^p - x$ is nonzero and of degree $p$, it has at
most $p$ roots¹ in a field $\mathbb{F}$. Hence, the fixed points of the Frobenius endomorphism
$F_p : \mathbb{F} \to \mathbb{F}$ are exhausted by the $p$ elements of the prime subfield $\mathbb{F}_p \subset \mathbb{F}$.


Exercise 3.3 $(y^n - x^n)/(y - x) = y^{n-1} + y^{n-2}x + y^{n-3}x^2 + \cdots + yx^{n-2} + x^{n-1}$.
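Exercise 3.5 Let $f(x) = \sum a_k x^k$. Then $f(x + t) = \sum_{k,\nu} a_k \binom{k}{\nu} x^{k-\nu} t^{\nu} = \sum_{\nu} t^{\nu} f_{\nu}(x)$,
where

$$f_{\nu}(x) = \sum_{k \geq \nu} a_k \binom{k}{\nu} x^{k-\nu} = \frac{1}{\nu!} \frac{d^{\nu}}{dx^{\nu}} f(x).$$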


Exercise 3.6 The right-hand side is nothing but the output of long division of f 
by x — a. However, now that it has been written down, the equality can easily be 
checked by straightforward expansion. 


Exercise 3.7 Completely similar to Exercise 2.8. 


¹ See Sect. 3.2 on p. 50.
Exercise 3.8 Completely similar to Exercise 2.5 on p. 25.
Exercise 3.9 A reducible polynomial of degree $\leq 3$ has a divisor of degree 1.


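Exercise 3.10 Uniqueness follows from Corollary 3.3 on p. 50. To construct $f$, note
that

$$f_i(x) = \prod_{\nu \neq i} (x - a_\nu)$$

vanishes at all points $a_\nu$ except for $a_i$, at which $f_i(a_i) \neq 0$. Thus $g_i(x) = f_i(x)/f_i(a_i)$
satisfies

$$g_i(a_\nu) = \begin{cases} 1 & \text{for } \nu = i, \\ 0 & \text{otherwise.} \end{cases}$$

Hence $f(x) = b_1g_1 + b_2g_2 + \cdots + b_ng_n = \sum_{i=1}^{n} b_i \prod_{\nu \neq i} \frac{x - a_\nu}{a_i - a_\nu}$ solves the problem.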
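
The interpolation formula of this hint can be evaluated exactly over the rationals. The sketch below is illustrative only: the function name `lagrange` and the sample points are assumptions, and the points $a_i$ are assumed to be distinct integers.

```python
from fractions import Fraction


def lagrange(points):
    """Return f with f(a_i) = b_i for the given pairs (a_i, b_i), a_i distinct."""
    def f(x):
        total = Fraction(0)
        for i, (a_i, b_i) in enumerate(points):
            # b_i times the product over v != i of (x - a_v) / (a_i - a_v)
            term = Fraction(b_i)
            for v, (a_v, _) in enumerate(points):
                if v != i:
                    term *= Fraction(x - a_v, a_i - a_v)
            total += term
        return total
    return f


points = [(0, 1), (1, 3), (2, 11)]
f = lagrange(points)
assert all(f(a) == b for a, b in points)
print(f(3))
```

For these sample points the interpolating polynomial is $3x^2 - x + 1$, which is the unique polynomial of degree at most 2 through them.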
Exercise 3.12 Completely similar to Exercise 1.10 on p.9. 


Exercise 3.13 The canonical embedding $\varphi : k \hookrightarrow k[x]/(x - \alpha)$ sends $\alpha \in k$ to
$[\alpha] = [x]$. This forces the equalities $[g(x)] = g([x]) = g([\alpha]) = [g(\alpha)]$ for all
$g \in k[x]$. Thus, $k[x]/(x - \alpha)$ is exhausted by the classes of constants. That is, the
monomorphism $\varphi$ is surjective.


Exercise 3.14 The inverse of a nonzero expression $a + b\sqrt{2} \in \mathbb{Q}[\sqrt{2}]$ is

$$\frac{a}{a^2 - 2b^2} - \frac{b}{a^2 - 2b^2} \cdot \sqrt{2}.$$

The ring in (a) is not a field, because it has zero divisors:
$[x + 1] \cdot [x^2 - x + 1] = [0]$.


The ring in (b) is a field by Proposition 3.8. 

Exercise 3.15 Use the Euclidean algorithm, which produces polynomials $h(x)$, $g(x)$ such that
$h(x)(x - a) + g(x)(x^2 + x + 1) = 1$. Since the remainder on division of $x^2 + x + 1$
by $x - a$ is $a^2 + a + 1$, the algorithm stops at the second step.

Exercise 3.17 Let $\vartheta_1' = \vartheta_1 + 2\pi k_1$, $\vartheta_2' = \vartheta_2 + 2\pi k_2$. Then
$\vartheta_1' + \vartheta_2' = \vartheta_1 + \vartheta_2 + 2\pi(k_1 + k_2)$, as required.

Exercise 3.18 The complex number $\zeta = \cos(2\pi/5) + i \sin(2\pi/5)$ is a root of the
polynomial

$$z^5 - 1 = (z - 1)(z^4 + z^3 + z^2 + z + 1).$$

The reciprocal equation $z^4 + z^3 + z^2 + z + 1 = 0$ can be solved in radicals via
division of both sides by $z^2$ and a change of variable $z \mapsto t = z + z^{-1}$.
Exercise 3.19 Write $\zeta = \zeta_1 = \cos(2\pi/n) + i\sin(2\pi/n)$ for the root with the
smallest positive argument and put $\xi = \zeta^k$, $\eta = \zeta^{\mathrm{GCD}(k,n)}$. Prove the following
stronger fact: The sets of all integer powers of $\xi$ and $\eta$ coincide. Hint: $\zeta^m = \xi^x$
means that $m = kx + ny$ for some $x, y \in \mathbb{Z}$.

Exercise 3.20 The equality $z_1z_2 = 1$ in $\mathbb{Z}[i]$ forces $|z_1| = |z_2| = 1$, because $|z|^2 \in \mathbb{N}$
for all nonzero $z \in \mathbb{Z}[i]$.

Exercise 3.21 Collect in $n_1$ all the prime divisors of $m_1$ whose multiplicity in $m_1$ is
greater than the multiplicity in $m_2$.


Exercise 3.23 Write the elements of $\mathbb{F}_p$ in a row as

$$-[(p-1)/2], \; \ldots, \; -[1], \; [0], \; [1], \; \ldots, \; [(p-1)/2]$$

and check that $a \in \mathbb{F}_p^*$ is a square if and only if the number of “positive” elements
that become “negative” after multiplication by $a$ is even (this fact is known as
Gauss’s lemma on quadratic residues). Then apply this to $a = 2$.
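
Gauss's lemma as stated in this hint can be compared with a brute-force search for square roots. The sketch assumes an odd prime $p$; the helper names `is_square_mod` and `sign_flips` and the list of tested primes are illustrative choices, not part of the text.

```python
def is_square_mod(a, p):
    """Brute-force test: is a a nonzero square in Z/(p)?"""
    return any((x * x - a) % p == 0 for x in range(1, p))


def sign_flips(a, p):
    """Count k in 1..(p-1)/2 for which a*k is represented in -(p-1)/2..-1."""
    half = (p - 1) // 2
    return sum(1 for k in range(1, half + 1) if (a * k) % p > half)


for p in (3, 5, 7, 11, 13, 17, 19, 23):
    for a in range(1, p):
        assert is_square_mod(a, p) == (sign_flips(a, p) % 2 == 0)

# The case a = 2: list the tested primes for which 2 is a square.
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23) if sign_flips(2, p) % 2 == 0])
```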


Exercise 4.1 The equality of the fractions $p/q = r/s$ means the equality $ps = qr$
in $k[x]$. If both representations are simplified, then $p$ and $q$ are coprime, $s$ and $r$
are coprime, and $q$, $s$ are both monic. It follows from Lemma 2.3 that $p = rf$ and $q = sg$
for some $f, g \in k[x]$. The equality $frs = grs$ forces $f = g$. Since
$\mathrm{GCD}(p,q) = \mathrm{GCD}(rg, sg) = 1$, the polynomial $g$ is an invertible constant, forced to be 1 because
$q$ and $s$ are both monic.

Exercise 4.3 Compare with formula (3.4) on p. 43. 

Exercise 4.4 The rule for differentiating a composition of functions from formula (3.11)
on p. 46 implies that $(f^m)' = m \cdot f^{m-1} \cdot f'$ for every $f$. Thus,


$$\frac{d}{dx}(1 - x)^{-m} = \frac{d}{dx}\left(\frac{1}{1 - x}\right)^{m} = m\,(1 - x)^{-m-1}.$$


Now the required formula can be checked by induction. 


Exercise 4.7 Differentiate both sides. 


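Exercise 4.12 Answers: $a_1 = \frac{1}{2}$, $a_2 = \frac{1}{6}$, $a_3 = 0$, $a_4 = -\frac{1}{30}$, $a_5 = 0$, $a_6 = \frac{1}{42}$,
$a_7 = 0$, $a_8 = -\frac{1}{30}$, $a_9 = 0$, $a_{10} = \frac{5}{66}$, $a_{11} = 0$, $a_{12} = -\frac{691}{2730}$,
$S_4(n) = n(n + 1)(2n + 1)(3n^2 + 3n - 1)/30$,
$S_5(n) = n^2(n + 1)^2(2n^2 + 2n - 1)/12$,
$S_{10}(1000) = 91\,409\,924\,241\,424\,243\,424\,241\,924\,242\,500$.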
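
The closed formulas for the power sums can be compared with direct summation. The following sketch assumes that $S_m(n)$ denotes $1^m + 2^m + \cdots + n^m$; the function name `S` is an illustrative choice.

```python
def S(m, n):
    """Power sum 1^m + 2^m + ... + n^m computed by direct summation."""
    return sum(k**m for k in range(1, n + 1))


for n in range(1, 30):
    assert 30 * S(4, n) == n * (n + 1) * (2 * n + 1) * (3 * n**2 + 3 * n - 1)
    assert 12 * S(5, n) == n**2 * (n + 1) ** 2 * (2 * n**2 + 2 * n - 1)

print(S(10, 1000))
```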
Exercise 4.14 Given a finite set $f_1, f_2, \ldots, f_m$ of fractional power series
$f_i \in k((x^{1/q_i}))$, they all can be considered simultaneously as elements of the Laurent series
field $k((x^{1/\mathrm{LCM}(q_1, q_2, \ldots, q_m)}))$, where all the properties of sums and products involving
the set $f_1, f_2, \ldots, f_m$ are clearly satisfied.
Exercise 4.15 Let m and 3(t) solve the modified problem. In the case of the first
modification, they solve the original problem as well. In the case of the second and 
third modifications, the original problem is solved by the same m and by the series 
0 (t)/ay(t2) and 3(t) + ay) (t4)/n respectively. 

Exercise 4.17 Compare with the proof of Proposition 4.5 on p.99 and remarks 
before it. 


Exercise 5.3 Let polynomials $f(x), g(x) \in I$ have degrees $m \geq n$ and leading
coefficients $a$, $b$. Then $a + b$ equals either zero or the leading coefficient of the
polynomial $f(x) + x^{m-n} \cdot g(x) \in I$ of degree $m$. Similarly, for every $\alpha \in K$, the
product $\alpha a$ is either zero or the leading coefficient of the polynomial $\alpha f(x) \in I$ of
degree $m$.


Exercise 5.4 Repeat the arguments from the proof of Theorem 5.1 but use the lowest 
nonzero terms instead of the leading ones. 


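Exercise 5.6 $\sin(2\pi x) \big/ \prod_{\alpha=-k}^{k} (x - \alpha) \in I_k \setminus I_{k-1}$.


Exercise 5.7 Write $I_k$ for the set of analytic functions vanishing at each point of
$\mathbb{Z} \subset \mathbb{C}$ except for integer points $m$ in the range $-k \leq m \leq k$. Then $I_0 \subsetneq I_1 \subsetneq I_2 \subsetneq \cdots$
is an infinite chain of strictly increasing ideals, because

$$\sin(2\pi z) \Big/ \prod_{\alpha=-k}^{k} (z - \alpha) \in I_k \setminus I_{k-1}.$$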


Exercise 5.8 Since $I$ is an additive subgroup of $K$, the congruence relation
$a_1 \equiv a_2 \pmod I$ is obviously symmetric, reflexive, and transitive. The correctness of the
definitions (5.3) is verified as in Exercise 1.10 on p. 9: if $a' = a + x$, $b' = b + y$ for
some $x, y \in I$, then $a' + b' = a + b + (x + y)$ and $a'b' = ab + (ay + bx + xy)$ are
congruent modulo $I$ to $a + b$ and $ab$ respectively.

Exercise 5.9 Write $\pi : K \twoheadrightarrow K/I$ for the quotient homomorphism. For an ideal
$J \subset K/I$, its preimage $\pi^{-1}(J)$ is an ideal in $K$. It is generated by some
$a_1, a_2, \ldots, a_m \in K$. Verify that their residues $[a_\nu]_I$ span $J$ in $K/I$.

Exercise 5.11 Each ideal in $\mathbb{C}[x]$ is principal.² If the quotient ring $\mathbb{C}[x]/(f)$ is a
field, then $f$ is irreducible. The monic irreducible polynomials in $\mathbb{C}[x]$ are exhausted
by the linear binomials $x - a$, where $a \in \mathbb{C}$. The principal ideal $(x - a)$ is equal to
$\ker \mathrm{ev}_a$, because $f(a) = 0$ if and only if $x - a$ divides $f(x)$. In $\mathbb{R}[x]$, the principal
ideal $\mathfrak{m} = (x^2 + 1)$ is not equal to $\ker \mathrm{ev}_a$ for every $a \in \mathbb{R}$, but $\mathbb{R}[x]/(x^2 + 1) \simeq \mathbb{C}$
is a field.


Exercise 5.12 Use compactness³ of the segment $[0, 1]$ to show that for every proper
ideal $I \subset C$, there exists a point $p \in [0,1]$ where all functions from $I$ vanish
simultaneously.⁴ This gives an inclusion $I \subset \ker \mathrm{ev}_p$, which has to be an equality as
soon as $I$ is maximal. The same holds if $[0, 1]$ is replaced by any compact set $X$.


² Namely, it is generated by the monic polynomial of minimal degree contained in the ideal.
³ Every open cover contains a finite subcover.


Exercise 5.14 If for each $k$ there is $x_k \in I_k \setminus \mathfrak{p}$, then $x_1 \cdot x_2 \cdots x_m \in \bigcap I_k \subset \mathfrak{p}$,
whereas $x_k \notin \mathfrak{p}$ for all $k$. Thus, $\mathfrak{p}$ is not prime.


Exercise 5.15 In (c) and (d), the degree function is multiplicative:
$\nu(z_1z_2) = \nu(z_1)\nu(z_2)$ and $\nu(z) \geq 1$ for all $z \neq 0$. Hence, $\nu(z_1z_2) \geq \nu(z_1)$. For every $z \in \mathbb{C}$,
there exists $w$ from the ring in question such that $|z - w| < 1$. If we take such $w$
for $z = a/b$, we get $|a - bw| < |b|$. Therefore, property (5.6) holds for $q = w$ and
$r = a - bw$.
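
For the Gaussian integers $\mathbb{Z}[i]$ (assumed here to be one of the rings in question), the choice of $w$ can be made explicit by rounding both coordinates of $a/b$ to nearest integers. The sketch below represents a Gaussian integer as a pair of Python integers; this representation and the name `gauss_divmod` are assumptions made for the illustration.

```python
def gauss_divmod(a, b):
    """Divide a by b != 0 in Z[i]: return (q, r) with a = b*q + r, |r| < |b|."""
    ar, ai = a
    br, bi = b
    norm = br * br + bi * bi
    # a / b = a * conj(b) / |b|^2; the numerators are exact integers.
    nr = ar * br + ai * bi
    ni = ai * br - ar * bi
    # Round each coordinate to a nearest integer, so |a/b - q| <= sqrt(2)/2 < 1.
    qr = (2 * nr + norm) // (2 * norm)
    qi = (2 * ni + norm) // (2 * norm)
    rr = ar - (br * qr - bi * qi)
    ri = ai - (br * qi + bi * qr)
    return (qr, qi), (rr, ri)


q, r = gauss_divmod((27, 23), (8, 1))
print(q, r)
```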

Exercise 5.16 If $b^{-1}$ exists, then $\nu(ab) \leq \nu(abb^{-1}) = \nu(a)$. Conversely, if
$\nu(ab) = \nu(a)$, then on division of $a$ by $ab$, we get $a = abq + r$, where either
$\nu(r) < \nu(ab) = \nu(a)$ or $r = 0$. On the other hand, $r = a(1 - bq)$ implies $\nu(r) \geq \nu(a)$ as soon
as $1 - bq \neq 0$. We conclude that either $1 - bq = 0$ or $r = 0$. In the latter case,
$a(1 - bq) = 0$, which forces $1 - bq = 0$ too. Thus, $bq = 1$.

Exercise 5.17 If $b = ax$ and $a = by = axy$, then $a(1 - xy) = 0$, which forces
$xy = 1$ as soon as $a \neq 0$.

Exercise 5.18 The polynomials $x, y \in \mathbb{Q}[x, y]$ have no common divisors except for
constants. Similarly, the polynomials $2, x \in \mathbb{Z}[x]$ have no common divisors except
for $\pm 1$.


Exercise 5.21 Look at the equality $a_0q^n + a_1q^{n-1}p + \cdots + a_{n-1}qp^{n-1} + a_np^n = 0$.
Exercise 5.22 Answer: $(x^2 - 2x + 2)(x^2 + 2x + 2)$.

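Exercise 6.1 If $0 \cdot v = w$, then $w + v = 0 \cdot v + 1 \cdot v = (0 + 1) \cdot v = 1 \cdot v = v$.
Add $-v$ to both sides and get $w = 0$. The equality $0 \cdot v = 0$ implies that
$\lambda \cdot 0 = \lambda(0 \cdot v) = (\lambda \cdot 0) \cdot v = 0 \cdot v = 0$. The computation
$(-1) \cdot v + v = (-1) \cdot v + 1 \cdot v = ((-1) + 1) \cdot v = 0 \cdot v = 0$ shows that $(-1) \cdot v = -v$.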

Exercise 6.3 Use increasing induction on $m$ to show that each monomial $x^m$ is a
linear combination of $f_0, f_1, \ldots, f_n$. This implies that $f_0, f_1, \ldots, f_n$ span $K[x]_{\leq n}$. Then
let $\sum \lambda_i f_i = \sum \mu_i f_i$ and compare the coefficients of $x^n$ on both sides. Conclude
that $\lambda_n f_n = \mu_n f_n$ can be canceled. Carry out this cancellation and compare the
coefficients at $x^{n-1}$, etc.


Exercise 6.4 If a vector space $V$ over $\mathbb{Q}$ has a countable basis, then as a set, $V$ has
the cardinality of the set $\bigcup_{m \geq 0} \mathbb{Q}^m$ of finite sequences of rational numbers, which is
countable. But the set $\mathbb{Q}[[x]]$ has the cardinality of the set $\mathbb{Q}^{\mathbb{N}}$ of infinite sequences of
rational numbers, which is uncountable by Cantor’s theorem.


Exercise 6.5 Let $V$ be generated by $n$ vectors. By Lemma 6.2 on p. 132, there are
at most $n$ linearly independent vectors in $V$. Conversely, let $e_1, e_2, \ldots, e_n \in V$ be
a linearly independent set of maximal cardinality. Then for every $v \in V$, there
exists a linear relation $\lambda v + \lambda_1e_1 + \lambda_2e_2 + \cdots + \lambda_ne_n = 0$, where $\lambda \neq 0$. Hence,
$v = -\sum \lambda^{-1}\lambda_i e_i$, and therefore the vectors $e_i$ generate $V$.


⁴ In the contrary case, there exists a finite collection of functions $f_1, f_2, \ldots, f_m \in I$ vanishing
nowhere simultaneously. Then $\sum f_\nu^2 \in I$ vanishes nowhere, i.e., it is invertible. This forces $I = C$.
Exercise 6.6 If $\mathbb{F}_{p^n} \subset \mathbb{F}_{p^m}$, then $p^m = (p^n)^d = p^{nd}$, where $d = \dim_{\mathbb{F}_{p^n}} \mathbb{F}_{p^m}$.
Exercise 6.7 Let $\lambda_1v_1 + \lambda_2v_2 + \cdots + \lambda_mv_m = 0$ for some $v_i \in I = \bigcup_\nu I_\nu$,
$\lambda_i \in k$. Since each $v_i$ belongs to some $I_{\nu_i}$, all the $v_i$ belong to $I_\mu$ for
$\mu = \max(\nu_1, \nu_2, \ldots, \nu_m)$. Since $I_\mu$ is linearly independent, all the $\lambda_i$ are equal to zero.


Exercise 6.8 Write $P$ for the set of all pairs $(D, J)$ such that $D$ and $J$ have the
same cardinality, $D \subset S$, $J \subset I$, and $J \cup (S \setminus D)$ spans $V$. The first step in the
proof of Lemma 6.2 on p. 132 shows that $P \neq \varnothing$. The set $P$ is partially ordered
by the relation $(D, J) \leq (D', J')$, meaning that $D \subset D'$ and $J \subset J'$. Show that $P$
is complete.⁵ By Zorn’s lemma, there exists a maximal pair $(D_{\max}, J_{\max}) \in P$. If
$J_{\max} \neq I$, then the same arguments as in the proof of Lemma 6.2 show that $D_{\max} \neq S$,
and they allow us to add one more vector to $D_{\max}$ and $J_{\max}$, in contradiction to the
maximality of $(D_{\max}, J_{\max})$.


Exercise 6.11 This is a special case of Proposition 7.3 on p. 162. Fix some basis in
$W$ and extend it to a basis in $V$ by some vector $e$, and put $\xi$ as the element of the
dual basis corresponding to $e$. For any two linear forms $\varphi$, $\psi$ vanishing on $W$ the
linear combination $\varphi(e)\psi - \psi(e)\varphi$ vanishes identically on $V$.

Exercise 6.12 This follows from Theorem 7.2 on p. 162. 

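Exercise 6.13 If $W \cup U$ is a subspace of $V$ and there is a vector $w \in W \setminus U$, then
for all $u \in U$, $w + u \in W \cup U$ but $w + u \notin U$. This forces $w + u \in W$ and $u \in W$.
Thus, $U \subset W$.


Exercise 6.14 Every vector $w \in V$ can be written as $w = \frac{\xi(w)}{\xi(v)} \cdot v + u$, where
$u = w - \frac{\xi(w)}{\xi(v)} \cdot v \in \ker \xi$, because $\xi\left(w - \frac{\xi(w)}{\xi(v)} \cdot v\right) = \xi(w) - \frac{\xi(w)}{\xi(v)} \cdot \xi(v) = 0$.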


Exercise 6.15 Use induction on the number of subspaces. The key case of two 
subspaces is covered by Corollary 6.8 on p. 140. 


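Exercise 6.16 Since each vector $v \in V$ has a unique representation as a sum
$v = \sum u_i$ with $u_i \in U_i$, the addition map $\bigoplus U_i \to V$, $(u_1, u_2, \ldots, u_m) \mapsto \sum u_i$, is
bijective. Clearly, it is also linear.


Exercise 6.17 Use the equalities $\overrightarrow{pr} + \overrightarrow{rs} = \overrightarrow{ps} = \overrightarrow{pq} + \overrightarrow{qs}$.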


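Exercise 6.20 The center of mass $c$ for the total collection of all points $p_i$, $q_j$ is
defined by

$$\sum_i \lambda_i \overrightarrow{cp_i} + \sum_j \mu_j \overrightarrow{cq_j} = 0.$$

Substitute $\overrightarrow{cp_i} = \overrightarrow{cp} + \overrightarrow{pp_i}$, $\overrightarrow{cq_j} = \overrightarrow{cq} + \overrightarrow{qq_j}$ in this equality and use the conditions
$\sum \lambda_i \overrightarrow{pp_i} = 0$, $\sum \mu_j \overrightarrow{qq_j} = 0$ to get $\left(\sum \lambda_i\right) \overrightarrow{cp} + \left(\sum \mu_j\right) \overrightarrow{cq} = 0$.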

Exercise 6.23 Similar to Exercise 5.8 on p. 106. 

Exercise 6.24 Bijectivity means that $F(v) = F(w) \iff v - w \in \ker F$.


⁵ Compare with Exercise 6.7 on p. 134.
Exercise 6.26 If $a_1 - a_2 \in B$, then $\lambda a_1 - \lambda a_2 = \lambda(a_1 - a_2) \in B$ as soon as $\lambda \cdot B \subset B$.
Exercise 7.5 If some finite linear combination $\sum_i \lambda_i (1 - \alpha_i t)^{-1}$ vanishes in $k[[t]]$,
then $\sum_i \lambda_i \prod_{\nu \neq i} (1 - \alpha_\nu t) = 0$ in $k[t]$. The substitution $t = \alpha_i^{-1}$ leads to the equality
$\lambda_i = 0$.

Exercise 7.6 The vectors $v_1, v_2, \ldots, v_n$ are linearly independent, because an
evaluation of $\varphi_i$ on both sides of the linear relation $\lambda_1v_1 + \lambda_2v_2 + \cdots + \lambda_nv_n = 0$
shows that $\lambda_i = 0$, and this holds for each $i$. Since $\dim V = n$, the vectors
$v_1, v_2, \ldots, v_n$ form a basis. The relations $\varphi_i(v_i) = 1$ and $\varphi_i(v_j) = 0$ for $i \neq j$
say that $\varphi_1, \varphi_2, \ldots, \varphi_n$ form the dual basis in $V^*$.

Exercise 7.7 Include $w$ as a basis vector in some basis of $W$ and take $\xi$ to be the
coordinate form $w^*$ along $w$.


Exercise 7.8 If a linear form vanishes on some vectors, then it vanishes on all linear
combinations of those vectors. 


Exercise 7.12 $\langle v, G^*F^*\xi \rangle = \langle Gv, F^*\xi \rangle = \langle FGv, \xi \rangle$.


Exercise 8.1 For units $e', e'' \in A$, we have $e' = e' \cdot e'' = e''$. For all $a, b \in A$ we
have $0 \cdot a = (b + (-1) \cdot b)a = ba + (-1)ba = 0$.


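Exercise 8.2 We have

$$E_{ij}E_{k\ell} = \begin{cases} E_{i\ell} & \text{for } j = k, \\ 0 & \text{otherwise.} \end{cases}$$

In particular, $E_{12}E_{21} \neq E_{21}E_{12}$. The complete list of commutation relations among
the $E_{ij}$ looks like this:

$$[E_{ij}, E_{k\ell}] \stackrel{\text{def}}{=} E_{ij}E_{k\ell} - E_{k\ell}E_{ij} = \begin{cases} E_{ii} - E_{jj} & \text{for } j = k \text{ and } i = \ell, \\ E_{i\ell} & \text{for } j = k \text{ and } i \neq \ell, \\ -E_{kj} & \text{for } j \neq k \text{ and } i = \ell, \\ 0 & \text{otherwise.} \end{cases}$$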
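
The commutation relations among the matrix units can be verified exhaustively for a small matrix size. In the sketch below, indices run from 0 and matrices are plain nested lists; the helper names (`E`, `mul`, `sub`, `commutator`, `expected`) and the size $n = 3$ are assumptions made for the example.

```python
def E(i, j, n):
    """Matrix unit with 1 at position (i, j) and zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def commutator(A, B):
    return sub(mul(A, B), mul(B, A))


def expected(i, j, k, l, n):
    """Right-hand side of the case-by-case formula for [E_ij, E_kl]."""
    zero = [[0] * n for _ in range(n)]
    if j == k and i == l:
        return sub(E(i, i, n), E(j, j, n))
    if j == k:
        return E(i, l, n)
    if i == l:
        return sub(zero, E(k, j, n))
    return zero


n = 3
for i in range(n):
    for j in range(n):
        for k in range(n):
            for l in range(n):
                assert commutator(E(i, j, n), E(k, l, n)) == expected(i, j, k, l, n)
print("all commutation relations verified for n =", n)
```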


Exercise 8.5 Let $AB = C$, $B^tA^t = D$. Then $c_{ij} = \sum_k a_{ik}b_{kj} = \sum_k a^t_{ki}b^t_{jk} = \sum_k b^t_{jk}a^t_{ki} = d_{ji}$.


Exercise 8.8 Write $u_1, u_2, \ldots, u_r$ for the nonzero rows of the reduced echelon
matrix in question and $j_1, j_2, \ldots, j_r$ for its shape. Then the $j_k$th coordinate of $\sum \lambda_\nu u_\nu$
equals $\lambda_k$ for all $k = 1, 2, \ldots, r$. Thus $\sum \lambda_\nu u_\nu = 0$ only if all $\lambda_\nu$ are equal to zero.


Exercise 8.9 The projection $p : k^n \twoheadrightarrow E_J$ along $E_I$ maps $U$ onto the linear span of
the rows of the submatrix formed by the columns indexed by $J$. The latter coincides
with the whole of $E_J$ if and only if its dimension satisfies $\dim E_J = r$.
Exercise 8.10 The original system is recovered from the transformed one by an
elementary transformation of the same type. 


Exercise 8.11 Each $d_i = \dim \pi_i(U)$ is equal to the number of nonzero rows in the
submatrix of $A$ formed by the first $i$ columns.


Exercise 8.12 A vector subspace of codimension $r^2 + \sum_{\nu=1}^{r} (j_\nu - \nu + 1)$ in $\mathrm{Mat}_{r \times n}(k)$
is formed by matrices having zeros in the columns $(j_1, j_2, \ldots, j_r)$ and at the first $j_\nu$
positions of the $\nu$th row for each $\nu = 1, \ldots, r$. A shift of this subspace by the
matrix $E_J$, which has the unit $r \times r$ submatrix in the columns $(j_1, j_2, \ldots, j_r)$ and zeros
in all other positions, consists exactly of reduced echelon matrices of combinatorial
type $(j_1, j_2, \ldots, j_r)$.


10 OO 
: _. OL O00 

E ise 8.15 Th ti 
xercise 8 e inverse matrix is | | 0 52 
00-177 


Exercise 8.16 See Exercise 8.1. 


Exercise 8.18 Verify that $\det(FG) = \det F \cdot \det G$ for $2 \times 2$ matrices. This forces $\det F$
to be invertible for invertible $F$, because $\det F \cdot \det F^{-1} = \det(FF^{-1}) = \det E = 1$.
If det F is invertible, then formula (8.7) can be proved by the same arguments as in 
Example 8.5. 


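Exercise 8.19 Use the equalities

$$\begin{pmatrix} a & b \\ c & 0 \end{pmatrix} = \begin{pmatrix} b & a \\ 0 & c \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & b \\ c & d \end{pmatrix} = \begin{pmatrix} b & 0 \\ d & c \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$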
Exercise 9.1 Write a permutation $g = (g_1, g_2, \ldots, g_n)$ of symbols $\{1, 2, \ldots, n\}$ as
$g = \sigma \circ g'$, where $\sigma$ swaps the symbols $n$ and $g_n$. Then $g' = \sigma \circ g$ preserves the
symbol $n$. By induction, $g'$ equals a composition of transpositions preserving $n$.
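
The inductive step of this hint translates directly into a procedure that peels off one transposition at a time. The sketch below assumes one-line notation with symbols $1, \ldots, n$ and products applied from right to left; the names `transpositions` and `apply_product` are illustrative.

```python
from itertools import permutations


def transpositions(g):
    """Return transpositions whose product (rightmost acts first) equals g."""
    g = list(g)
    result = []
    for n in range(len(g), 1, -1):
        if g[n - 1] != n:
            a, b = n, g[n - 1]
            result.append((a, b))
            # Replace g by sigma∘g, where sigma swaps a and b; the new g fixes n.
            g = [b if v == a else a if v == b else v for v in g]
    return result


def apply_product(ts, n):
    """Compose the transpositions in ts, the rightmost factor acting first."""
    perm = list(range(1, n + 1))
    for a, b in reversed(ts):
        perm = [b if v == a else a if v == b else v for v in perm]
    return tuple(perm)


for g in permutations(range(1, 5)):
    ts = transpositions(g)
    assert apply_product(ts, 4) == g and len(ts) <= 3
print(transpositions((2, 3, 1)))
```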


Exercise 9.2 If only transversal double crossings of threads are allowed, two threads
drawn from $i$ and $j$ have an odd number of intersections⁶ if and only if the pair $(i, j)$
is reversed by $g$. For a shuffle permutation $(i_1, i_2, \ldots, i_k, j_1, j_2, \ldots, j_m)$, the threads
going from $i_1, i_2, \ldots, i_k$ do not intersect each other but do intersect, respectively,
$i_1 - 1, i_2 - 2, \ldots, i_k - k$ threads going from $j$-points situated on the left. Since those
$j$-threads also do not intersect each other,

$$\operatorname{sgn}(i_1, i_2, \ldots, i_k, j_1, j_2, \ldots, j_m) = (-1)^{\sum_\nu (i_\nu - \nu)}.$$


⁶ In fact, the threads can always be drawn in such a way that this number equals either 0 or 1 for
any two threads.
Exercise 9.3 If $g = \sigma_1\sigma_2 \cdots \sigma_k$, where all $\sigma_i$ are transpositions, then $h = \sigma_k \cdots \sigma_2\sigma_1$
satisfies the equalities $hg = gh = \mathrm{Id}$, because $\sigma_i\sigma_i = \mathrm{Id}$ for all $i$. Hence, $g^{-1} = h$
and $\operatorname{sgn}(g^{-1}) = (-1)^k = \operatorname{sgn}(g)$.

Exercise 9.5 Write $v_1, v_2, \ldots, v_n$ for the columns of $C$. The condition $\operatorname{rk} C < n$
implies that these vectors are linearly related. For instance, let $v_1 = \sum_{k \geq 2} \lambda_k v_k$.
Then $\det(v_1, v_2, \ldots, v_n) = \sum_{k \geq 2} \lambda_k \det(v_k, v_2, \ldots, v_n) = 0$.


Exercise 9.6 Use induction on $m$. For $m = 1$, the vanishing condition means
that $f \in k[x]$ has more than $\deg f$ roots. Hence $f = 0$ by Corollary 3.2 on p. 50.
For $m > 1$, write $f \in k[x_1, x_2, \ldots, x_m]$ as a polynomial in $x_m$ with coefficients in
$k[x_1, x_2, \ldots, x_{m-1}]$ and evaluate the coefficients at an arbitrary point $q \in k^{m-1}$. This
leads to a polynomial in $k[x_m]$ vanishing at each point of $k$. Hence, it must be the
zero polynomial. This forces each coefficient of $f$ to vanish at all $q \in k^{m-1}$. By
induction, all coefficients have to be zero polynomials, i.e., $f = 0$.


Exercise 9.8 If $\det F \neq 0$, then $\dim \operatorname{im} F = \dim V$, because otherwise, $\det F = 0$
by Exercise 9.5. Thus $F$ is surjective and hence invertible if $\det F \neq 0$. In this case,
$\det F \cdot \det F^{-1} = \det \mathrm{Id} = 1$.


Exercise 9.9 To move each $\xi_i$ through all the $\eta_j$ from right to left, one needs $m$
transpositions.


Exercise 9.10 For even n, such a basis consists of all monomials of even degree, 
while for odd n, it consists of even-degree monomials and the highest-degree 
monomial $\xi_1 \wedge \xi_2 \wedge \cdots \wedge \xi_n$, which has odd degree.


Exercise 9.11 Just transpose everything and use the equality $\det A = \det A^t$.
Exercise 9.12 The result follows at once from Proposition 9.3 and Corollary 6.4 on 
p. 136. 

Exercise 10.1 Linearity is a basic property of the integral. Positivity holds because 
a nonzero nonnegative continuous function is positive over some interval. 

Exercise 10.2 The linear form $ge_j$ sends the basis vector $e_i$ to $(e_i, e_j)$. Thus, the $i$th
coordinate of this form in the dual basis equals $(e_i, e_j)$.


Exercise 10.5 If $u$ and $w$ either are linearly independent or have $(u, w) < 0$, then
strict inequality holds in the computation made two rows below formula (10.21) on 
p. 239. 


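Exercise 10.6 The Euclidean dual basis to the basis $u_1 = 1$, $u_2 = t$, $u_3 = t^2$ in $U$
is formed by vectors $u_j^\vee$ whose coordinates in the basis $u_i$ are in the columns of the
matrix

$$C_{uu^\vee} = G_u^{-1} = \begin{pmatrix} 1 & 1/2 & 1/3 \\ 1/2 & 1/3 & 1/4 \\ 1/3 & 1/4 & 1/5 \end{pmatrix}^{-1} = \begin{pmatrix} 9 & -36 & 30 \\ -36 & 192 & -180 \\ 30 & -180 & 180 \end{pmatrix}.$$

By Theorem 10.1, the orthogonal projection $\pi_U v = \sum (v, u_i) \cdot u_i^\vee$ has coordinates

$$-\frac{1}{4} \begin{pmatrix} 9 \\ -36 \\ 30 \end{pmatrix} - \frac{1}{5} \begin{pmatrix} -36 \\ 192 \\ -180 \end{pmatrix} - \frac{1}{6} \begin{pmatrix} 30 \\ -180 \\ 180 \end{pmatrix} = \begin{pmatrix} -1/20 \\ 3/5 \\ -3/2 \end{pmatrix}$$

in the basis $u_1, u_2, u_3$. Thus, the required polynomial is

$$f(t) = t^3 - \frac{3}{2}t^2 + \frac{3}{5}t - \frac{1}{20},$$

and the value of the integral $\int_0^1 f^2(t)\,dt$ is equal to

$$\begin{pmatrix} -1/20 & 3/5 & -3/2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1/2 & 1/3 & 1/4 \\ 1/2 & 1/3 & 1/4 & 1/5 \\ 1/3 & 1/4 & 1/5 & 1/6 \\ 1/4 & 1/5 & 1/6 & 1/7 \end{pmatrix} \begin{pmatrix} -1/20 \\ 3/5 \\ -3/2 \\ 1 \end{pmatrix} = \frac{1}{2800}.$$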
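
The arithmetic of this hint can be reproduced with exact rational numbers. The sketch assumes the inner product $(f, g) = \int_0^1 f(t)g(t)\,dt$ on polynomials, so that the Gram matrix of $1, t, t^2, \ldots$ has entries $1/(i + j + 1)$, and it projects $v = -t^3$ onto the span of $1, t, t^2$; the helper names `gram` and `solve` are assumptions made for the illustration.

```python
from fractions import Fraction as Fr


def gram(n):
    """Gram matrix of 1, t, ..., t^(n-1) for the integral over [0, 1]."""
    return [[Fr(1, i + j + 1) for j in range(n)] for i in range(n)]


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


# Inner products of v = -t^3 with 1, t, t^2.
coords = solve(gram(3), [Fr(-1, 4), Fr(-1, 5), Fr(-1, 6)])
f = coords + [Fr(1)]  # coefficients of f at 1, t, t^2, t^3
G4 = gram(4)
integral = sum(f[i] * G4[i][j] * f[j] for i in range(4) for j in range(4))
print(coords, integral)
```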


Exercise 10.9 The transition matrix has det © — 1 7 ) > 0. 
v 


Exercise 10.10 Let $\Pi_1 = a + W_1$, $\Pi_2 = b + W_2$. Since $W_1 \neq W_2$,
$\dim(W_1 \cap W_2) < \dim W_1 = \dim V - 1$. Hence

$$\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2) > 2(\dim V - 1) - (\dim V - 1) = \dim V - 1.$$

This forces $\dim(W_1 + W_2) = \dim V$, that is, $V = W_1 + W_2$ and
$\dim(W_1 \cap W_2) = \dim V - 2$. The intersection

$$\Pi_1 \cap \Pi_2 = \{a + W_1\} \cap \{b + W_2\}$$

is not empty, because there are some $w_i \in W_i$ such that $a - b = w_2 - w_1$, that is,
$a + w_1 = b + w_2$.

Exercise 10.12 The first three statements are obvious from formula (10.27) on
p. 246, while the fourth follows immediately from the second and the third: writing
$v \in V$ as $v = \lambda e + u$, where $u \in e^\perp$, we get
$|\sigma_e(v)| = |u - \lambda e| = \sqrt{|u|^2 + \lambda^2} = |u + \lambda e| = |v|$.

Exercise 10.13 Vectors $a, b \in V$ with $|a| = |b| \neq 0$ are transposed by the reflection
in the middle perpendicular hyperplane⁷ to $[a, b]$. To prove the second statement,
use induction on $n = \dim V$. The base case $n = 1$ is trivial.⁸ In the generic case,
choose an arbitrary nonzero vector $v \in V$ and compose a given isometry $F$ with a
reflection $\sigma$ sending $Fv$ to $v$. Since $\sigma F$ preserves $v$, it sends the $(n - 1)$-dimensional
subspace $v^\perp$ to itself. By induction, the restriction of $\sigma F$ to $v^\perp$ is a composition of
at most $(n - 1)$ reflections. The latter can be extended to reflections of the whole
space $V$ in the hyperplanes containing $v$. Thus, $\sigma F : V \xrightarrow{\sim} V$ is a composition of at
most $(n - 1)$ reflections. Hence $F = \sigma\sigma F$ is a composition of at most $n$ reflections.


⁷ See Example 10.7 on p. 241.


Exercise 11.2 The cardinality of the left-hand side is equal to the number of nonzero
vectors in $\mathbb{F}_q^{n+1}$ divided by the number of nonzero vectors in a line, that is, it is equal
to $(q^{n+1} - 1)/(q - 1)$. On the right-hand side is $q^n + q^{n-1} + \cdots + q + 1$. A good
reason to call it a geometric progression, isn’t it?
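
The count can be confirmed by enumerating lines through the origin directly. The sketch below handles only prime $q$, so that $\mathbb{Z}/(q)$ is a field; the function name `projective_points` and the tested values are assumptions made for the example.

```python
from itertools import product


def projective_points(q, n):
    """Count points of the n-dimensional projective space over Z/(q), q prime."""
    seen = set()
    count = 0
    for v in product(range(q), repeat=n + 1):
        if any(v) and v not in seen:
            count += 1
            # Mark every nonzero vector on the line spanned by v as visited.
            for c in range(1, q):
                seen.add(tuple(c * x % q for x in v))
    return count


for q in (2, 3, 5):
    for n in (1, 2, 3):
        assert projective_points(q, n) == sum(q**k for k in range(n + 1))
print(projective_points(3, 2))
```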

Exercise 11.3 In terms of polynomials, a change of basis in $V^*$ means a
linear change of variables $(x_0, x_1, \ldots, x_n)$ by variables
$(y_0, y_1, \ldots, y_n) = (x_0, x_1, \ldots, x_n) \cdot C$, where $C \in \mathrm{Mat}_{n+1}(k)$ is some invertible matrix. Since each
monomial in y can be written as a combination of monomials in x, the function 
V — k produced by a monomial in y lies in the linear span of functions produced 
by monomials in x. Since C is invertible, the converse statement is also true. Thus, 
the linear spans of the polynomial functions produced by the monomials in x and in 
y coincide. 


Exercise 11.8 Answer: ay —lifdimV=n+1. 


Exercise 11.9 Let $L_1 = \mathbb{P}(U)$, $L_2 = \mathbb{P}(W)$, $p = \mathbb{P}(k \cdot e)$. Since $p \notin L_2$, we have $V = W \oplus k \cdot e$.
The projection in question is induced by the linear projection of $V$ onto $W$
along $k \cdot e$. Since $k \cdot e \cap U = 0$, the restriction of this linear projection to the subspace $U$ has zero
kernel.

Exercise 11.10 Write $\varphi_p$ (respectively $\varphi_q$) for the linear fractional transformations
sending $p_1$, $p_2$, $p_3$ (respectively $q_1$, $q_2$, $q_3$) to $\infty$, 0, 1. If
$[p_1, p_2, p_3, p_4] = [q_1, q_2, q_3, q_4]$, then $\varphi_p(p_4) = \varphi_q(q_4)$, and the composition $\varphi_q^{-1} \circ \varphi_p$ sends $p_1$,
$p_2$, $p_3$, $p_4$ to $q_1$, $q_2$, $q_3$, $q_4$. Conversely, if there is a linear fractional transformation
$\varphi_{qp}$ sending $p_i$ to $q_i$, then the composition $\varphi_p \circ \varphi_{qp}^{-1}$ takes $q_1$, $q_2$, $q_3$, $q_4$ to $\infty$, 0, 1,
$[p_1, p_2, p_3, p_4]$ respectively. This forces $[p_1, p_2, p_3, p_4] = [q_1, q_2, q_3, q_4]$.
Exercise 12.2 If, then 

Exercise 12.3 Taking $h_1 = h_2 = h$ for some $h \in H$, we conclude that $e \in H$.
Taking $h_1 = e$, $h_2 = h$, we conclude that $h^{-1} \in H$ for all $h \in H$. Taking $h_1 = g$,
$h_2 = h^{-1}$, we conclude that $gh \in H$ for all $g, h \in H$.

Exercise 12.4 $\operatorname{Aut} G = \mathrm{GL}_1(\mathbb{F}_p) = \mathbb{F}_p^* \simeq \mathbb{Z}/(p - 1)$.

Exercise 12.5 Answer: n(n — 1)---(n — m + 1)/m (the numerator consists of m 
factors). 

⁸ See Example 10.10 on p. 245.


Exercise 12.6 Let $k = dr$, $m = ds$, where $d \in \mathbb{N}$ and $\mathrm{GCD}(r,s) = 1$. If $d > 1$, then
$\tau^d$ splits into $d$ nonintersecting cycles of length $s$, and $\tau^k = (\tau^d)^r$ is the composition
of the $r$th powers of these cycles. If $\mathrm{GCD}(m, k) = 1$, consider an arbitrary element $a$
appearing in $\tau$ and write $r$ for the minimal positive exponent such that $(\tau^k)^r a = a$.
Then $m \mid kr$. Hence, $m \mid r$. Therefore, $r \geq m$. This forces $\tau^k$ to be a cycle of length $m$.
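
The cycle structure of a power of a cycle can be checked experimentally: the $k$th power of a cycle of length $m$ should split into $\mathrm{GCD}(m, k)$ cycles of length $m/\mathrm{GCD}(m, k)$. The sketch below uses 0-based symbols; the names `cycle_perm`, `power`, and `cycle_lengths` are illustrative.

```python
from math import gcd


def cycle_perm(m):
    """The cycle 0 -> 1 -> ... -> m-1 -> 0 as a list of images."""
    return [(i + 1) % m for i in range(m)]


def power(perm, k):
    """The k-th power of a permutation given as a list of images."""
    result = list(range(len(perm)))
    for _ in range(k):
        result = [perm[x] for x in result]
    return result


def cycle_lengths(perm):
    """Sorted lengths of the cycles of perm, fixed points included."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = perm[x]
                length += 1
            lengths.append(length)
    return sorted(lengths)


for m in range(1, 13):
    for k in range(1, 25):
        d = gcd(m, k)
        assert cycle_lengths(power(cycle_perm(m), k)) == [m // d] * d
print(cycle_lengths(power(cycle_perm(6), 4)))
```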


Exercise 12.7 Let cycles $\tau_1$, $\tau_2$ commute and have a common element $a$. Then
$\tau_1(a)$ also lies in $\tau_2$, because otherwise, $\tau_2\tau_1(a) = \tau_1(a)$ but $\tau_2(a) \neq a$ forces
$\tau_1\tau_2(a) \neq \tau_1(a)$. For the same reason, $\tau_2(a)$ lies in $\tau_1$. Hence, both cycles consist of
the same elements. In particular, they have equal lengths. Let $\tau_1(a) = \tau_2^s(a)$. Then
each element $b$ appearing in the cycles can be written as $b = \tau_2^r(a)$, and $\tau_1$ acts on $b$
exactly as $\tau_2^s$ does:

$$\tau_1(b) = \tau_1\tau_2^r(a) = \tau_2^r\tau_1(a) = \tau_2^r\tau_2^s(a) = \tau_2^s\tau_2^r(a) = \tau_2^s(b).$$

Thus, $\tau_1 = \tau_2^s$, and by Exercise 12.6, $s$ is coprime to the length of the cycles.


Exercise 12.8 Two of the $n!$ fillings of a given Young diagram produce equal
permutations⁹ if they are obtained from each other by cyclically permuting the
elements within rows or arbitrarily permuting rows of equal lengths in their entirety 
(compare with formula (1.11) on p.7). 

Exercise 12.9 |1,6,3,4)' - |2,5,8) - |7,9) = |1,6,3,4)7! - |7,9) = 
(4, 2, 6, 3, 5, 1, 9, 8, 7). 

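Exercise 12.13 Answer: $|1,2,3,4\rangle = \sigma_{12}\sigma_{23}\sigma_{34}$, $|1,2,4,3\rangle = \sigma_{12}\sigma_{24}\sigma_{34}$,
$|1,3,2,4\rangle = \sigma_{13}\sigma_{23}\sigma_{24}$, $|1,3,4,2\rangle = \sigma_{13}\sigma_{34}\sigma_{24}$, $|1,4,2,3\rangle = \sigma_{24}\sigma_{23}\sigma_{13}$,
$|1,4,3,2\rangle = \sigma_{34}\sigma_{23}\sigma_{12}$.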


Exercise 12.15 The calculations for the cube are similar to those for the tetrahedron 
and dodecahedron. The groups of the octahedron are isomorphic to the groups 
of the cube with vertices in the centers of the octahedral faces. The same holds 
for the icosahedron and dodecahedron (just look at the models you made in 
Exercise 12.10). 


Exercise 12.16 Fix a basis in $V$ and send the operator $F \in \mathrm{GL}(V)$ to the basis
formed by vectors $f_i = F(e_i)$. The first vector $f_1$ of that basis can be chosen in
$|V| - 1 = q^n - 1$ ways, the second in $|V| - |k \cdot f_1| = q^n - q$ ways, the third in
$|V| - |k \cdot f_1 \oplus k \cdot f_2| = q^n - q^2$ ways, etc.
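
The resulting product $(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$ can be compared with a brute-force count of invertible matrices. The sketch assumes a prime $q$, so that $\mathbb{Z}/(q)$ is a field, and it uses the three-argument `pow` with exponent $-1$, which requires Python 3.8 or later; the names `det_mod`, `count_gl`, and `formula` are illustrative.

```python
from itertools import product


def det_mod(M, p):
    """Determinant of a square matrix over Z/(p), p prime, by elimination."""
    M = [row[:] for row in M]
    n, det = len(M), 1
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % p), None)
        if piv is None:
            return 0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            det = -det
        det = det * M[c][c] % p
        inv = pow(M[c][c], -1, p)
        for r in range(c + 1, n):
            factor = M[r][c] * inv % p
            M[r] = [(x - factor * y) % p for x, y in zip(M[r], M[c])]
    return det % p


def count_gl(n, p):
    """Number of invertible n x n matrices over Z/(p), by exhaustive search."""
    return sum(
        1
        for entries in product(range(p), repeat=n * n)
        if det_mod([list(entries[i * n:(i + 1) * n]) for i in range(n)], p)
    )


def formula(n, q):
    """The product (q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    result = 1
    for k in range(n):
        result *= q**n - q**k
    return result


assert count_gl(2, 2) == formula(2, 2)
assert count_gl(2, 3) == formula(2, 3)
assert count_gl(3, 2) == formula(3, 2)
print(formula(2, 3), formula(3, 2))
```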

Exercise 12.17 Each vertex of the dodecahedron belongs to exactly two of five 
cubes. These two cubes have exactly two common vertices, and these vertices are 
opposite vertices of the dodecahedron. Therefore, a rotation preserving two cubes is 
forced to be a rotation about the axis passing through the common opposite vertices 
of the cubes. Taking the other two cubes, we conclude that the only proper isometry 
preserving four cubes is the identity map. 


Exercise 12.18 Hint: The central symmetry commutes with all elements of $O_{\mathrm{dod}}$,
whereas in $S_5$ there are no such elements except for the unit.


⁹ Assuming that the permutation obtained from such a filling is the product of cycles written in
rows and read from left to right.

Exercise 12.23 We sketch the arguments for the case of the icosahedron. The
icosahedral group acts transitively on the 20 faces of the icosahedron. The stabilizer 
of a face in the complete (respectively proper) icosahedral group is the complete 
(respectively proper) group of the regular triangle in the Euclidean plane. Therefore, 
$|O_{\mathrm{ico}}| = 20 \cdot 6 = 120$ (respectively $|SO_{\mathrm{ico}}| = 20 \cdot 3 = 60$).

Exercise 12.24 Since elements $g_1g_2^{-1}$, $g_2g_1^{-1}$ are inverse to each other, if one of
them lies in $H$, then the other lies there, too. The equality $h_1g_1 = h_2g_2$ forces
$g_1g_2^{-1} = h_1^{-1}h_2 \in H$. Conversely, if $g_1g_2^{-1} = h \in H$, then $Hg_1 = Hhg_2 = Hg_2$,
because $Hh = H$ for all $h \in H$.

Exercise 12.25 The inclusion $gHg^{-1} \subset H$ is equivalent to the inclusion
$H \subset g^{-1}Hg$. If this holds for all $g \in G$, then we can substitute $g$ by $g^{-1}$ in the second
inclusion and get $gHg^{-1} \supset H$.


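Exercise 12.26 $\varphi \circ \mathrm{Ad}_g \circ \varphi^{-1} : h \mapsto \varphi\bigl(g\,\varphi^{-1}(h)\,g^{-1}\bigr) = \varphi(g)\,h\,\varphi(g)^{-1}$.
Exercise 12.27 Let $\varphi \in A(V)$ and $p = \varphi^{-1}(q)$. Then $\varphi(p + v) = q + D_\varphi(v)$ and


$$\varphi \circ \tau_v \circ \varphi^{-1} : q \mapsto \varphi(p + v) = q + D_\varphi(v).$$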


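Exercise 12.29 If $\varphi(x) \in N_2$, then $\varphi(gxg^{-1}) = \varphi(g)\varphi(x)\varphi(g)^{-1} \in N_2$, because
$N_2 \lhd G_2$ is normal. Hence, $N_1 = \varphi^{-1}(N_2) \lhd G_1$. The composition of surjective
homomorphisms $G_1 \twoheadrightarrow G_2 \twoheadrightarrow G_2/N_2$ is a surjection with kernel $N_1$.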


Exercise 13.2 Write $\varepsilon$ for the variable taking the values $\pm 1$. When we insert $x^{\varepsilon}x^{-\varepsilon}$
in some word $w$, we get a word in which removal of any fragment $y^{\varepsilon}y^{-\varepsilon}$ leads either
back¹⁰ to $w$ or to a word obtained from $w$ by removal of the same fragment $y^{\varepsilon}y^{-\varepsilon}$
followed by inserting $x^{\varepsilon}x^{-\varepsilon}$ at the same place as in $w$.


Exercise 13.3 Send $n \in \mathbb{N}$ to $x^n y x^n \in F$ and use Proposition 13.1 on p. 310.


Exercise 13.5 Multiply both sides by the word from the right-hand side written in
the reverse order. This leads either to $(x_1x_2)^n = e$ or to $x_2(x_1x_2)^n x_2^{-1} = e$.


Exercise 13.6 If some geodesic passes through some vertex $v$ of the triangulation, then it breaks the $2m_i$ edges outgoing from $v$ in two parts, each consisting of $m_i$ edges, such that any two edges cut by the same reflection plane occur in different parts. Therefore, when we vary $a$ or $b$ and the geodesic $(a, b)$ meets the vertex $v$, a fragment of length $m_i$ in $w$ is replaced by the complementary fragment of the same length $m_i$. This proves the first statement. The second can be proved by induction on the length of the word representing $g$. If it is zero, then $g = e$, and there is nothing to prove. Assume that the statement is true for all $g$ representable by a word of length at most $n$. It is enough to show that it holds for all elements $g\sigma_i$. Triangle $g\sigma_i$ is obtained from triangle $g$ by reflection in the plane $g(\pi_i)$, which passes through their common side and breaks the sphere into two hemispheres. If triangles $g\sigma_i$ and $e$ lie in the same hemisphere, we can choose points $a$, $b$, $b'$ in triangles $e$, $g\sigma_i$, $g$ in such a


$^{10}$Note that this may happen not only when we remove exactly the same fragment $x^{\varepsilon}x^{-\varepsilon}$ that was just inserted; sometimes, one of the letters $x^{\pm\varepsilon}$ can be canceled by another neighbor.
way that the shortest geodesic $a, b'$ passes through $b$ and crosses the common side of triangles $g\sigma_i$, $g$. This geodesic produces words $w$ and $wx_i$ such that $\varphi(w) = g\sigma_i$ and $\varphi(wx_i) = g$. By induction, $wx_i$ is the shortest word for $g$. Since $g\sigma_i$ is represented by an even shorter word, the statement holds for $g\sigma_i$ by induction as well. If triangles $g\sigma_i$ and $e$ lie in different hemispheres, then the shortest geodesic from $e$ to $g\sigma_i$ can be chosen coming into triangle $g\sigma_i$ from triangle $g$ through their common side. This geodesic produces words $wx_i$ and $w$ for $g\sigma_i$ and $g$. If $wx_i$ is equivalent to some shorter word, then the latter is of length at most $n$, and the statement holds for $g\sigma_i$ by induction. It remains to consider only the case that $wx_i$ is the shortest word for $g\sigma_i$. But that is exactly what we need.


Exercise 13.7 If the first statement is not evident, see Sect. 13.2.2 on p. 321. Relations $\sigma_i^2 = e$ and $\sigma_i\sigma_j = \sigma_j\sigma_i$ for $|i - j| \geq 2$ are obvious. It suffices to verify the relation $\sigma_i\sigma_{i+1}\sigma_i = \sigma_{i+1}\sigma_i\sigma_{i+1}$ only in $S_3 \simeq D_3$.

Exercise 13.8 Let $\mathbb{A}^n = \mathbb{A}(V)$ and write $(p_0, p_1, \ldots, p_n)$ and $(q_0, q_1, \ldots, q_n)$ for the points in question. As we have seen in Sect. 6.5.5 on p. 148, the required map $\mathbb{A}(V) \to \mathbb{A}(V)$ takes $x \mapsto q_0 + F(\overrightarrow{p_0x})$, where $F : V \to V$ is some linear map sending $\overrightarrow{p_0p_i}$ to $\overrightarrow{q_0q_i}$ for all $1 \leq i \leq n$. Since both collections of vectors $\overrightarrow{p_0p_i}$ and $\overrightarrow{q_0q_i}$ form bases of $V$, such a linear map $F$ exists and is unique and bijective.


Exercise 13.9 Let $v_i = \overrightarrow{ce_i} = e_i - c$ be the radius vector of vertex $i$ in $\mathbb{R}^{n+1}$. Then $n_{ij} = v_i - v_j$ is orthogonal to the hyperplane $\pi_{ij}$, because for every $k \neq i, j$, the inner product is given by

$(n_{ij},\, v_k - (v_i + v_j)/2) = (v_i, v_k) - (v_j, v_k) - (v_i, v_i)/2 + (v_j, v_j)/2 = 0$

(products $(v_i, v_j)$ for $i \neq j$ and squares $(v_i, v_i)$ do not depend on $i, j$ by symmetry). A similar computation shows that $n_{ij}$ and $n_{km}$ are orthogonal for $\{i, j\} \cap \{k, m\} = \varnothing$. Vectors $v_i - v_k$ and $v_k - v_j$ span the Euclidean plane, where they are the sides of an equilateral triangle with vertices $v_i$, $v_j$, $v_k$. Therefore, the angle between $n_{ik} = v_i - v_k$ and $n_{jk} = v_j - v_k$ equals $60^\circ$.


Exercise 13.10 The argument is word for word the same as in Exercise 13.6. 


Exercise 13.11 If $\ell(g) = 0$, then $g$ has no inversions. If $\ell(g) = n(n + 1)/2$, then all the pairs must be reversed for $g$.


Exercise 13.14 An epimorphism $S_4 \twoheadrightarrow D_3$ from Example 12.10 on p. 291 maps $A_4 \subset S_4$ onto the cyclic subgroup of rotations.


Exercise 13.15 Since $C \lhd D$, the product $(A \cap D)C$ is a normal subgroup of $D$ by Proposition 12.5 on p. 302. The isomorphism $HN/N \simeq H/(H \cap N)$ from Proposition 12.5 for $G = D$, $H = B \cap D$, $N = (A \cap D)C$ is the required $(B \cap D)C/(A \cap D)C \simeq (B \cap D)/(A \cap D)(B \cap C)$, because for $A \subset B$, we have $HN = (B \cap D)(A \cap D)C = (B \cap D)C$, and the equality $H \cap N = (B \cap D) \cap (A \cap D)C = (A \cap D)(B \cap C)$ holds (any element $d = ac \in (B \cap D) \cap (A \cap D)C$, where $d \in B \cap D$, $a \in A \cap D$, $c \in C$, has $c = a^{-1}d \in C \cap B$).


Exercise 13.16 The requisite odd permutation is either a transposition of two fixed 
elements or a transposition of two elements forming a cycle of length 2. 
Exercise 13.17 For every 5-cycle $\tau$ there is some $g \in S_5$ such that $g\tau g^{-1} = |1, 2, 3, 4, 5\rangle$. If $g$ is odd, then $h\tau h^{-1} = |2, 1, 3, 4, 5\rangle$ for even $h = \sigma_{12}g$. The assignment $\tau \mapsto \sigma_{12}\tau\sigma_{12}$ establishes a bijection between 5-cycles conjugate to $|1, 2, 3, 4, 5\rangle$ in $A_5$ and 5-cycles conjugate to $|2, 1, 3, 4, 5\rangle$.

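Exercise 13.18 Reducing $|H| = 12\varepsilon_1 + 12\varepsilon_2 + 20\varepsilon_3 + 15\varepsilon_4 + 1$ modulo 3, 4, and 5, we get $1 - \varepsilon_3$, $1 - \varepsilon_4$, and $1 + 2(\varepsilon_1 + \varepsilon_2)$ respectively. Thus, $|H|$ is divisible by 3 or 4 only if $\varepsilon_3 = 1$ or $\varepsilon_4 = 1$. In both cases, $|H| \geq 16$. Hence, $|H|$ is neither 3 nor 4 nor $3 \cdot 4$ nor $3 \cdot 5$. If $|H|$ is divisible by 5, then $\varepsilon_1 = \varepsilon_2 = 1$ and $|H| \geq 25$. Hence, $|H|$ is neither 5 nor $4 \cdot 5$. The remaining possibilities are only $|H| = 1$ and $|H| = 3 \cdot 4 \cdot 5$.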
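
The counting argument of Exercise 13.18 can also be checked by brute force. The following Python sketch (the function and constant names are illustrative and do not come from the text) runs over all choices of $\varepsilon_i \in \{0, 1\}$ and keeps the values of $1 + 12\varepsilon_1 + 12\varepsilon_2 + 20\varepsilon_3 + 15\varepsilon_4$ that divide 60.

```python
from itertools import product

# Coefficients of eps_1..eps_4 in the formula for |H| from the hint.
CLASS_SIZES = (12, 12, 20, 15)
GROUP_ORDER = 60


def candidate_normal_orders(class_sizes=CLASS_SIZES, group_order=GROUP_ORDER):
    """Return all values 1 + sum(eps_i * c_i), eps_i in {0, 1}, that divide the group order."""
    orders = set()
    for eps in product((0, 1), repeat=len(class_sizes)):
        size = 1 + sum(e * c for e, c in zip(eps, class_sizes))
        if group_order % size == 0:
            orders.add(size)
    return sorted(orders)


print(candidate_normal_orders())  # [1, 60]
```

Only the orders 1 and 60 survive, in agreement with the hint.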
Exercise 13.20 Choose $k \notin \{i, j, g^{-1}(i)\}$. Then $g(k) = m \notin \{i, j, k\}$. If $n \geq 6$, there exists an even permutation $h$ that preserves $i$, $j$, $k$ and sends $m$ to some $\ell \neq m$. Then $hgh^{-1}$ maps $i \mapsto j$ and $k \mapsto \ell \neq m$.


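Exercise 13.21 Associativity is checked as follows:

$\bigl((x_1, h_1) \cdot (x_2, h_2)\bigr) \cdot (x_3, h_3) = \bigl(x_1\psi_{h_1}(x_2),\, h_1h_2\bigr) \cdot (x_3, h_3) = \bigl(x_1\psi_{h_1}(x_2)\psi_{h_1h_2}(x_3),\, h_1h_2h_3\bigr)$,

$(x_1, h_1) \cdot \bigl((x_2, h_2) \cdot (x_3, h_3)\bigr) = (x_1, h_1) \cdot \bigl(x_2\psi_{h_2}(x_3),\, h_2h_3\bigr) = \bigl(x_1\psi_{h_1}(x_2\psi_{h_2}(x_3)),\, h_1h_2h_3\bigr)$,

and $\psi_{h_1}(x_2\psi_{h_2}(x_3)) = \psi_{h_1}(x_2)\,\psi_{h_1}\circ\psi_{h_2}(x_3) = \psi_{h_1}(x_2)\psi_{h_1h_2}(x_3)$. Multiplication by the unit yields $(x, h) \cdot (e, e) = (x\psi_h(e), he) = (x, h)$, because $\psi_h(e) = e$ ($\psi_h$ is a group homomorphism). As for the inverse element, $(\psi_{h^{-1}}(x^{-1}), h^{-1}) \cdot (x, h) = (\psi_{h^{-1}}(x^{-1})\psi_{h^{-1}}(x),\, h^{-1}h) = (e, e)$.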
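
The group axioms of Exercise 13.21 can be tested on a small example. The sketch below is only an illustration under assumptions not made in the text: it takes $N = \mathbb{Z}/(n)$ written additively, $H = (\mathbb{Z}/(n))^*$, and $\psi_h(x) = hx$, so that the product reads $(x_1, h_1)(x_2, h_2) = (x_1 + h_1x_2,\, h_1h_2)$. All function names are hypothetical.

```python
from itertools import product
from math import gcd


def make_group(n):
    """Pairs (x, h) with x in Z/(n) and h invertible modulo n."""
    units = [h for h in range(1, n) if gcd(h, n) == 1]
    return [(x, h) for x in range(n) for h in units]


def mul(a, b, n):
    """(x1, h1) * (x2, h2) = (x1 + psi_{h1}(x2), h1 h2), where psi_h(x) = h x mod n."""
    (x1, h1), (x2, h2) = a, b
    return ((x1 + h1 * x2) % n, (h1 * h2) % n)


def inverse(a, n):
    """Additive form of the hint's formula: (x, h)^{-1} = (psi_{h^{-1}}(-x), h^{-1})."""
    x, h = a
    h_inv = pow(h, -1, n)  # modular inverse (Python 3.8+)
    return ((-h_inv * x) % n, h_inv)


def check_group_axioms(n):
    """Brute-force check of associativity, unit, and inverses."""
    G = make_group(n)
    e = (0, 1)
    assoc = all(mul(mul(a, b, n), c, n) == mul(a, mul(b, c, n), n)
                for a, b, c in product(G, repeat=3))
    unit = all(mul(a, e, n) == a == mul(e, a, n) for a in G)
    inv = all(mul(inverse(a, n), a, n) == e == mul(a, inverse(a, n), n) for a in G)
    return assoc and unit and inv


print(check_group_axioms(7))
```

For $n = 7$ the group has 42 elements, so the associativity check runs over $42^3$ triples.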
Exercise 13.22 Since $\psi_e = \mathrm{Id}_N$ (because $\psi : H \to \operatorname{Aut} N$ is a homomorphism), we have $(x_1, e) \cdot (x_2, e) = (x_1\psi_e(x_2), e) = (x_1x_2, e)$.

Hence, $N'$ is a subgroup and is isomorphic to $N$. The normality of $N'$ is checked as follows:

$(y, h) \cdot (x, e) \cdot (\psi_{h^{-1}}(y^{-1}), h^{-1}) = (y\psi_h(x), h) \cdot (\psi_{h^{-1}}(y^{-1}), h^{-1}) = (y\psi_h(x)y^{-1}, e)$.

All the statements about $H'$ are evident. The isomorphism $N' \rtimes H' \simeq N \rtimes_\psi H$ holds because $\mathrm{Ad}_{(e,h)}(x, e) = (\psi_h(x), e)$.


Exercise 13.23 Write $G$ for the group and $C = Z(G)$ for its center. If $|C| = p$, then $C \simeq \mathbb{Z}/(p) \simeq G/C$. Let $C$ be generated by $a \in C$ and $G/C$ be generated by $bC$ for some $b \in G$. Then every element of $G$ can be written as $b^k a^m$. Since $a$ commutes with $b$, any two such elements commute.


Exercise 13.24 For every $k \in \mathbb{Z}$, the multiplication-by-$m \in \mathbb{Z}$ map

$m : \mathbb{Z}/(k) \to \mathbb{Z}/(k), \quad [x] \mapsto m[x] = \underbrace{[x] + \cdots + [x]}_{m} = [mx]$,

is a well-defined endomorphism of the additive group $\mathbb{Z}/(k)$. For $k = m$, it takes the whole of $\mathbb{Z}/(m)$ to zero, whereas for $k = n$, it is injective, because $\mathrm{GCD}(m, n) = 1$. For every additive homomorphism $\psi : \mathbb{Z}/(m) \to \mathbb{Z}/(n)$ and all $x \in \mathbb{Z}/(m)$, we have $m \cdot \psi(x) = \psi(m \cdot x) = \psi(0) = 0$ in $\mathbb{Z}/(n)$. This forces $\psi(x) = 0$, because multiplication by $m$ is injective in $\mathbb{Z}/(n)$.
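
The conclusion of Exercise 13.24 is easy to test numerically. An additive homomorphism $\psi : \mathbb{Z}/(m) \to \mathbb{Z}/(n)$ is determined by $a = \psi([1])$, and $a$ is admissible exactly when $m \cdot a = 0$ in $\mathbb{Z}/(n)$. The Python sketch below (the function name is illustrative) lists all admissible $a$.

```python
def additive_homs(m, n):
    """Images a = psi([1]) of all additive homomorphisms Z/(m) -> Z/(n).

    psi is well defined if and only if m * a = 0 in Z/(n).
    """
    return [a for a in range(n) if (m * a) % n == 0]


print(additive_homs(4, 9))  # coprime orders: only the zero map
print(additive_homs(6, 4))  # GCD(6, 4) = 2: two homomorphisms
```

For coprime $m$, $n$ the list consists of $0$ alone, as the hint asserts. In general its length equals $\mathrm{GCD}(m, n)$.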

Exercise 14.1 The same arguments as in Exercise 5.8 on p. 106. 


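Exercise 14.8 If $\lambda_1m_1 = 0$ and $\lambda_2m_2 = 0$ for some nonzero $\lambda_i \in K$, then $\lambda_1\lambda_2(m_1 + m_2) = 0$ and $\lambda_1\lambda_2 \neq 0$, because $K$ has no zero divisors. Finally, for every $\mu \in K$, we have $\lambda_1(\mu m_1) = \mu(\lambda_1m_1) = 0$.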

Exercise 14.11 Let $\lambda' = \lambda + x$ and $a' = a + v$ for some $x \in I$, $v \in IM$. Then $\lambda'a' = \lambda a + (xa + \lambda v + xv)$, where the sum in parentheses lies in $IM$.


Exercise 14.16 For any subspace $U \subset V$, choose a basis $E$ in $U$ and complete it to some basis $E \sqcup F$ in $V$. Then $V = U \oplus W$, where $W$ is the linear span of $F$.


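Exercise 14.17 In the Grassmannian algebra $K\langle\xi_1, \xi_2, \ldots, \xi_n\rangle$, consider two collections of homogeneous linear forms: $\eta = \xi \cdot A$ and $\zeta = \eta \cdot C = \xi \cdot F$, where $F = AC$. The Grassmannian monomials of degree $k$ in $\eta$ and $\zeta$ are $\eta_J = \sum_I \xi_I a_{IJ}$ and $\zeta_K = \sum_I \xi_I f_{IK}$. Since $\zeta_K = \sum_J \eta_J c_{JK}$, we conclude that $f_{IK} = \sum_J a_{IJ}c_{JK}$.
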
Exercise 14.18 Let the resulting transformation of rows (respectively columns) in $C$ be provided by left (respectively right) multiplication by the invertible matrix $S = S_k \cdots S_2S_1$ (respectively by $R = R_1R_2 \cdots R_\ell$). Then $F = S_k \cdots S_2S_1E$ (respectively $G = ER_1R_2 \cdots R_\ell$).

Exercise 14.19 The equivalence of the first three conditions is evident in any 
reciprocal bases of Z” and L. The last two conditions are equivalent by the definition 
of the rank of a matrix. 


Exercise 14.20 Clearly, $\varphi_n = 0$ for $n \geq m$. Let $0 < n < m$. If $\varphi_n(x) = 0$, then $p^nx = p^my$ for some $y \in K$. Since $K$ has no zero divisors, we conclude that $x = p^{m-n}y$. Conversely, if $x = p^{m-n}y$, then $p^nx \equiv 0 \pmod{p^m}$. Therefore, $\ker\varphi_n = \operatorname{im}\varphi_{m-n}$. Finally, the assignment $x \pmod{p^n} \mapsto p^{m-n}x \pmod{p^m}$ produces a well-defined $K$-linear injection $K/(p^n) \hookrightarrow K/(p^m)$ that isomorphically maps $K/(p^n)$ onto $\operatorname{im}\varphi_{m-n}$.

Exercise 14.21 We can assume that $M = K/(p^m)$. In this case, $\ker\varphi_i/\ker\varphi_{i-1}$ consists of classes $[p^{m-i}x] \in K/(p^m)$ modulo classes of the form $[p^{m-i+1}y]$. Since multiplication by $p$ annihilates them all, multiplication by the elements of the quotient $K/(p)$ is well defined in $\ker\varphi_i/\ker\varphi_{i-1}$.

Exercise 14.22 Answer: either for r = 1 and all n; = 0, or for r = O and distinct 
P1,P25--+-+ »>Pa- 

Exercise 15.2 For $n = 1$, the space $k[t]/(t) \simeq k$ is simple. For $n > 1$, the image of multiplication by $[t]$ is a proper invariant subspace in $V = k[t]/(t^n)$. Let $k[t]/(t^n) = U \oplus W$, where $U$, $W$ each goes to itself under multiplication by $[t]$. If $U$, $W$ both are contained in the image of multiplication by $[t]$, then $U + W$ also is contained there and cannot coincide with $V$. Thus at least one of the subspaces $U$, $W$, say $U$, contains some class $[f]$ represented by a polynomial $f$ with nonzero constant term. Then $f, tf, \ldots, t^{n-1}f$ are linearly independent in $k[t]/(t^n)$. Hence $\dim U \geq n$ and $U = V$.

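Exercise 15.3 If $V = U \oplus W$, where $U$, $W$ both are $F$-invariant, then $V^* = \operatorname{Ann} U \oplus \operatorname{Ann} W$, and both subspaces $\operatorname{Ann} U$, $\operatorname{Ann} W$ are $F^*$-invariant. Indeed, if $\xi \in \operatorname{Ann} U$, then for all $u \in U$, $\langle F^*\xi, u\rangle = \langle \xi, Fu\rangle = 0$, because $Fu \in U$. Hence, $F^*\xi \in \operatorname{Ann} U$.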


Exercise 15.4 Write each $(F_i, U_i)$ as in formula (15.1) on p. 363 and use the uniqueness statement of Theorem 15.1 on p. 363.


Exercise 15.5 The vectors $g_i$ are computed recursively in decreasing order of indices as $g_{m-1} = h_m$, $g_i = h_{i+1} + F_vg_{i+1}$. As for the remainder, $r = h_0 + F_vg_0 = h_0 + F_v(h_1 + F_vg_1) = h_0 + F_v(h_1 + F_v(h_2 + F_vg_2)) = \cdots = h_0 + F_vh_1 + \cdots + F_v^mh_m$.


Exercise 15.6 In the dual bases $e$ and $e^*$ of $V$ and $V^*$, the matrices $tE - F_e$ and $tE - F^*_{e^*}$ are transposes of each other. Hence they have the same Smith forms (convince yourself of this!) and the same elementary divisors.


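Exercise 15.7 Choose a basis compatible with the decomposition $V = U \oplus W$. In this basis, the matrix $tE - F$ has block-diagonal form

$\begin{pmatrix} tE - G & 0 \\ 0 & tE - H \end{pmatrix}$,

where $G$ and $H$ denote the matrices of the restrictions of $F$ to $U$ and $W$. The result follows from the formula

$\det\begin{pmatrix} A & B \\ 0 & C \end{pmatrix} = \det A \cdot \det C$,

valid for every $A \in \mathrm{Mat}_n(k)$, $C \in \mathrm{Mat}_m(k)$, $B \in \mathrm{Mat}_{n \times m}(k)$ and verified by the Laplace expansion of the determinant in the first $n$ columns.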


Exercise 15.8 Let $f = t^n + a_1t^{n-1} + \cdots + a_{n-1}t + a_n$. Write $F$ for the matrix of multiplication by $t$ in the basis $t^{n-1}, t^{n-2}, \ldots, t, 1$ and expand $\det(tE - F)$ along the first column.


Exercise 15.9 Since multiplication by the product of all elementary divisors annihilates the direct sum in formula (15.1) on p. 363, we have $\chi_F(F) = 0$ for every operator $F$. This proves the Cayley–Hamilton identity for matrices over any field. In the greatest generality, the Cayley–Hamilton identity $\chi_A(A) = 0$ for $n \times n$ matrices $A = (a_{ij})$ can be treated as a collection of $n^2$ identities in the ring of polynomials with integer coefficients in $n^2$ variables $a_{ij}$. To prove them, it is enough to check that in evaluating these polynomial relations at all points of the coordinate space $\mathbb{Q}^{n^2}$, we get numerical identities. In other words, the Cayley–Hamilton identity for rational matrices implies the Cayley–Hamilton identity for matrices over any commutative ring.
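
As a numerical illustration of the identity $\chi_A(A) = 0$ over $\mathbb{Q}$, here is a short Python sketch. It computes the characteristic polynomial by the Faddeev–LeVerrier recursion, which is a technique chosen here for brevity and not the method of the text, and then evaluates the polynomial at the matrix by Horner's scheme. All function names are illustrative.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [1, c_1, ..., c_n] of det(tE - A) = t^n + c_1 t^(n-1) + ... + c_n."""
    n = len(A)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k  # c_k = -tr(A M_k) / k
        coeffs.append(c)
        # M_{k+1} = A M_k + c_k E
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs


def poly_at_matrix(coeffs, A):
    """Evaluate the polynomial with the given coefficients at the matrix A."""
    n = len(A)
    R = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        R = mat_mul(R, A)
        for i in range(n):
            R[i][i] += c
    return R


A = [[2, -1, 0], [3, 5, 7], [1, 0, -4]]
print(poly_at_matrix(char_poly(A), A))  # the zero matrix
```

Exact rational arithmetic is used so that the zero matrix is obtained exactly and not up to rounding.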

Exercise 15.10 Since every vector $h \in H$ can be written as $h = u + q + r$, where $u \in U$, $q \in Q$, $r \in R$, the equality $h = \pi(h) = \pi(u) + \pi(r)$ holds in $H$. Since $\pi(u) = u \in U$ and $\pi(r) \in W$, we conclude that $U + W = H$. If $u \in U \cap W$, then $u = \pi(r)$ for some $r \in R$ and $\pi(u - r) = \pi(u) - \pi(r) = u - u = 0$. Thus, $u - r \in \ker\pi = Q$. This is possible only for $u = r = 0$. Hence, $U \cap W = 0$.
Exercise 15.11 The 1-dimensional isometry is Id; the improper 2-dimensional
isometry is the orthogonal direct sum of 1-dimensional isometries +Id and —Id; the 
proper 2-dimensional isometry is a rotation. 


Exercise 15.12 For $f \neq g$, the operators have different characteristic polynomials and cannot be similar.


Exercise 15.13 Let $\lambda \in \operatorname{Spec} F$, $f(\lambda) \neq 0$, and let $v \in V$ be an eigenvector of $F$ with eigenvalue $\lambda$. Then $f(F)v = f(\lambda) \cdot v \neq 0$.


Exercise 15.14 A nilpotent operator cannot have nonzero eigenvalues. Hence, all roots of $\chi_F(t)$ equal zero. Over an algebraically closed field, this forces $\chi_F(t) = t^m$. Conversely, if $\chi_F(t) = t^m$, then $F^m = 0$ by the Cayley–Hamilton identity.


Exercise 15.15 For example, the multiplication by $t$ in the residue class module $k[t]/(t^n)$ for $n \geq 2$.


Exercise 15.16 For instance, the rotation of the Euclidean plane $\mathbb{R}^2$ by $90^\circ$ about the origin.


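Exercise 15.18 Factorization of the characteristic polynomial

$\chi_F(t) = \prod_{\lambda \in \operatorname{Spec} F} (t - \lambda)^{m_\lambda}$

satisfies Proposition 15.7 for $q_i = (t - \lambda)^{m_\lambda}$ and $\ker(\lambda\,\mathrm{Id} - F)^{m_\lambda} = K_\lambda$.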


Exercise 15.19 If $a^n = 0$, $b^m = 0$, and $ab = ba$, then $(\lambda a + \mu b)^{m+n-1} = 0$ by Newton's binomial formula.


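Exercise 15.20 The map (15.17) is clearly $\mathbb{C}$-linear. The relation $s(fg) = s(f)s(g)$ can be verified separately for each jet $j_\lambda^{m_\lambda}$. By the Leibniz rule,

$(fg)^{(k)} = \sum_{\nu+\mu=k} \binom{k}{\nu} f^{(\nu)}g^{(\mu)}$.

Therefore, the following congruences modulo $(t - \lambda)^{m_\lambda}$ hold:

$j_\lambda^{m_\lambda}(fg) \equiv \sum_k \frac{(t-\lambda)^k}{k!} \sum_{\nu+\mu=k} \frac{k!}{\nu!\,\mu!}\, f^{(\nu)}(\lambda)\,g^{(\mu)}(\lambda)$

$\equiv \sum_k \sum_{\nu+\mu=k} \frac{f^{(\nu)}(\lambda)}{\nu!}(t-\lambda)^{\nu} \cdot \frac{g^{(\mu)}(\lambda)}{\mu!}(t-\lambda)^{\mu} \equiv j_\lambda^{m_\lambda}(f)\,j_\lambda^{m_\lambda}(g)$.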


Exercise 15.21 Over $\mathbb{C}$, one could use Proposition 15.9. Over an arbitrary field $k$, the operator $F$ with matrix $J_n(\lambda)$ equals $\lambda\,\mathrm{Id} + N$, where $N^n = 0$ but $N^{n-1} \neq 0$. Then the inverse operator

$F^{-1} = (\lambda\,\mathrm{Id} + N)^{-1} = \lambda^{-1}(\mathrm{Id} + N/\lambda)^{-1} = \lambda^{-1} - \lambda^{-2}N + \lambda^{-3}N^2 - \cdots + (-1)^{n-1}\lambda^{-n}N^{n-1}$

equals $\lambda^{-1}\mathrm{Id} + M$, where $M = -\lambda^{-2}N(1 - \lambda^{-1}N + \cdots)$ also has $M^n = 0$ but $M^{n-1} = (-1)^{n-1}\lambda^{-2(n-1)}N^{n-1} \neq 0$. Therefore, the Jordan normal form of $F^{-1}$ is exhausted by just one block $J_n(\lambda^{-1})$.
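
The finite series for the inverse of a Jordan block can be checked in exact arithmetic. The sketch below is an illustration with hypothetical helper names: it builds $J_n(\lambda)$ over $\mathbb{Q}$, fills in the entries $(-1)^{j-i}\lambda^{-(j-i)-1}$ prescribed by the series, and lets one verify both $F \cdot F^{-1} = E$ and the formula for $M^{n-1}$.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_block(lam, n):
    """J_n(lam): lam on the diagonal, ones on the first superdiagonal."""
    return [[Fraction(lam) if i == j else Fraction(int(j == i + 1))
             for j in range(n)] for i in range(n)]


def inverse_by_series(lam, n):
    """Matrix of sum_{k=0}^{n-1} (-1)^k lam^(-k-1) N^k; N^k has ones on the k-th superdiagonal."""
    lam = Fraction(lam)
    return [[(-1) ** (j - i) / lam ** (j - i + 1) if j >= i else Fraction(0)
             for j in range(n)] for i in range(n)]


def nilpotent_part_power(lam, n):
    """Return M^(n-1) for M = F^{-1} - lam^{-1} E (assumes n >= 2)."""
    inv = inverse_by_series(lam, n)
    M = [[inv[i][j] - (Fraction(1) / Fraction(lam) if i == j else 0)
          for j in range(n)] for i in range(n)]
    P = M
    for _ in range(n - 2):
        P = mat_mul(P, M)
    return P


n, lam = 4, 3
E = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
print(mat_mul(jordan_block(lam, n), inverse_by_series(lam, n)) == E)
print(nilpotent_part_power(lam, n)[0][n - 1])  # (-1)^(n-1) * lam^(-2(n-1))
```

The only nonzero entry of $M^{n-1}$ sits in the upper right corner, so $M$ is nilpotent of exact order $n$.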


Exercise 16.3 In the language of correlations, the condition $\beta_1(u, w) = \beta_2(fu, fw)$ means that $\langle u, \beta_1w\rangle = \langle fu, \beta_2fw\rangle = \langle u, f^*\beta_2fw\rangle$.

Exercise 16.4 Let $B = (b_{ij})$ be the matrix of $\beta : V \to V^*$. Then $b_{ij}$ is equal to the $i$th coordinate of the covector $\beta e_j$ in the basis $e_1^*, e_2^*, \ldots, e_n^*$, that is, to the value of the covector $\beta e_j$ on the vector $e_i = e_i^{**}$. The latter is $\beta(e_i, e_j)$.

Exercise 16.6 Answer: $\dim \operatorname{Hom}_\pm(V, V^*) = n(n \pm 1)/2$. These are the dimensions of the spaces of symmetric and skew-symmetric matrices.

Exercise 16.7 E.g., $B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}$.

Exercise 16.8 For example, the linear span of the vectors $e_\nu + ie_{n+\nu}$, $1 \leq \nu \leq n$.

Exercise 16.9 If $B \in \mathrm{Mat}_n(k)$ has $B^t = -B$, then $\det B = \det B^t = \det(-B) = (-1)^n\det B$.

Exercise 16.10 The dual basis is $1, -D, D^2/2, \ldots, (-1)^nD^n/n!$.

Exercise 16.12 ${}^{\vee}(f^{\vee}) = (\beta^*)^{-1}(\beta^{-1}f^*\beta)^*\beta^* = f^{**} = f$.

Exercise 16.13 The equality $\beta(u, w) = \beta(gu, gw)$ holds for all $u, w \in V$ if and only if $g^*\beta g = \beta$. The latter forces $\det g \neq 0$ and is equivalent to $\beta^{-1}g^*\beta = g^{-1}$.


Exercise 16.14 Since the canonical operator of W, ((-1)*") and the canonical 
operator of U, @ U; have the same elementary divisors, they are similar. Hence, the 
forms are equivalent by Theorem 16.1 on p. 401. 

Exercise 16.15 Write 0 = ui, uy, u'2,...,u'm, 1 < i < k, for Jordan chains of 
some Jordan basis in V. Then the vectors u, form a basis in im 7”~!; the vectors ui, 
for 1 < k < m forma basis in ker 7”!. By Lemma 16.3 on p. 406, every vector ui 
is orthogonal to all vectors ul with k < m. 


Exercise 17.1 Let $e_1, e_2, \ldots, e_n$ form a basis in $V$ and suppose $u, w \in V$ are expanded in terms of this basis as $u = \sum x_ie_i$, $w = \sum y_ie_i$. If we write $q$ as in formula (17.2) on p. 421, we get

$q(u + w) - q(u) - q(w) = (x + y)B(x^t + y^t) - xBx^t - yBy^t = xBy^t + yBx^t = 2\,xBy^t$.

(The last equality holds because $yBx^t$ is a $1 \times 1$ matrix and therefore is symmetric, i.e., $yBx^t = (yBx^t)^t = xB^ty^t = xBy^t$, since $B = B^t$.) The other statements are verified similarly.
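
The identity $q(u + w) - q(u) - q(w) = 2\,xBy^t$ is easy to test on numbers. The following Python sketch (function names are illustrative) evaluates both sides for a symmetric integer matrix $B$ and coordinate rows $x$, $y$.

```python
def bilinear(B, x, y):
    """The number x B y^t for coordinate rows x, y and a square matrix B."""
    n = len(B)
    return sum(x[i] * B[i][j] * y[j] for i in range(n) for j in range(n))


def quadratic(B, x):
    """q(x) = x B x^t."""
    return bilinear(B, x, x)


def polarization_gap(B, x, y):
    """q(x + y) - q(x) - q(y) - 2 x B y^t; zero whenever B is symmetric."""
    s = [a + b for a, b in zip(x, y)]
    return quadratic(B, s) - quadratic(B, x) - quadratic(B, y) - 2 * bilinear(B, x, y)


B = [[1, 2], [2, -3]]  # symmetric Gram matrix
print(polarization_gap(B, (1, 4), (-2, 5)))  # 0
```

For a nonsymmetric $B$ the gap is in general nonzero, which shows where the assumption $B = B^t$ enters.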

Exercise 17.2 For every $v', v'' \in V$ and $u', u'' \in \ker q$, we have $\tilde q(v' + u', v'' + u'') = \tilde q(v', v'')$. Hence $q(v)$ depends only on $[v] = v \pmod{\ker q}$. If $[u] \in \ker q_{\mathrm{red}}$, then $\tilde q_{\mathrm{red}}([v], [u]) = \tilde q(v, u) = 0$ for all $v \in V$. This forces $u \in \ker q$ and $[u] = 0$.


Exercise 17.3 If $q(e_1) = q(e_2) = 0$ and $\tilde q(e_1, e_2) = a \neq 0$, then the vectors $e_1$ and $e_2/a$ form a hyperbolic basis for $q$.
Exercise 17.5 The reflection $\sigma_{f(e)}$ acts identically on $f(e)^\perp$ and maps $f(e) \mapsto -f(e) = f(-e)$. The composition $f \circ \sigma_e \circ f^{-1}$ does the same, because $f^{-1}$ maps $f(e)^\perp \to e^\perp$ for an orthogonal operator $f$.

Exercise 17.10 Let $\mathbb{P}(V) = \mathbb{P}(\operatorname{Ann}\xi) \cup \mathbb{P}(\operatorname{Ann}\eta)$ for some nonzero covectors $\xi, \eta \in V^*$. Then the quadratic form $q(v) = \xi(v)\eta(v)$ vanishes identically on $V$. Therefore, its polarization $\tilde q(u, w) = (q(u + w) - q(u) - q(w))/2$ also vanishes. Hence, the Gramian of $q$ is zero, i.e., $q$ is the zero polynomial. However, the polynomial ring has no zero divisors.


Exercise 17.11 Every rank-1 matrix is a product $cr$ of some column $c$ and some row $r$. For a symmetric matrix, $r = c^t$. Thus, a quadratic form of rank 1 looks like $q(x) = xcc^tx^t = (xc)^2$. A singular quadratic surface in $\mathbb{P}_3$ is either a double plane, or two crossing planes, or a cone over a smooth plane conic,$^{11}$ or a line, or a point. The last two cases are impossible over an algebraically closed field.


Exercise 17.12 Identify $\mathbb{P}_1 = \mathbb{P}(U)$ with the Veronese conic $C \subset \mathbb{P}_2 = \mathbb{P}(S^2U^*)$. Let $\sigma$ swap some points $A_1$, $A_2$ and swap some other points $B_1$, $B_2$ on $C$. The lines $(A_1A_2)$ and $(B_1B_2)$ on $\mathbb{P}_2 = \mathbb{P}(S^2U^*)$ cross at some point $P = \{a, b\}$. Then $a, b \in \mathbb{P}_1$ are the fixed points of $\sigma$. Indeed, the pencil of lines passing through $P$ defines an involution on $C$ that swaps the intersection points of the lines with $C$. This involution coincides with $\sigma$, because it acts on the four points $A_i$, $B_i$ exactly as $\sigma$ does. The fixed points of this involution are the points at which the tangent lines drawn from $P$ to $C$ meet $C$, i.e., $\{a, a\}$ and $\{b, b\}$.

Exercise 17.13 Substituting $x = a + tb$ in $q(x) = 0$ and using $q(a) = 0$, we get $2\tilde q(a, b)t + q(b)t^2 = 0$. The root $t = 0$ gives $x = a$; the root $t = -2\tilde q(a, b)/q(b)$ for $\tilde q(a, b) \neq 0$ gives $x = c = q(b) \cdot a - 2\tilde q(a, b) \cdot b$.

Exercise 17.14 The conics passing through a given point $p \in \mathbb{P}_2 = \mathbb{P}(V)$ form a hyperplane in $\mathbb{P}_5 = \mathbb{P}(S^2V^*)$, because the equation $q(p) = 0$ is linear in $q$. Now (a) holds, because every five hyperplanes in $\mathbb{P}_5$ have nonempty intersection. To prove (c), note that a singular conic $C$ is either a double line or a pair of intersecting lines. This forces some three points of any five points lying on $C$ to be collinear. Therefore, every conic passing through five points without collinear triples is smooth. It is unique by Proposition 17.6 on p. 438. To prove (b), it remains to consider the case that some three of the five points lie on a line $\ell_1$ and the remaining two points lie on the other line $\ell_2 \neq \ell_1$. In this case, the conic is forced to be $\ell_1 \cup \ell_2$.


Exercise 17.15 Since the space of quadrics in $\mathbb{P}_3 = \mathbb{P}(V)$ has dimension $\dim\mathbb{P}(S^2V^*) = 9$, every nine hyperplanes in $\mathbb{P}(S^2V^*)$ have nonempty intersection. This gives (a). To get (b), choose three points on each line and draw a quadric through them. To verify the last statement, note that there are no three mutually skew lines on a singular quadric by Exercise 17.11 on p. 437. A smooth quadric $S = Z(q)$ containing lines must be given by a hyperbolic form $q$. Hence, $S$ is isomorphic to the Segre quadric. Such a quadric passing through given skew lines $\ell$, $\ell'$, $\ell''$ is unique,


$^{11}$That is, the union of lines $(sa)$, where $s \in \mathbb{P}_3$ is fixed and $a$ runs through some smooth conic in the plane complementary to $s$.




because for every point $a \in \ell$, there exists a unique line $(b'b'') \ni a$ with $b' \in \ell'$, $b'' \in \ell''$. This line lies on a quadric, and the lines coming from all points $a \in \ell$ rule a quadric.


Exercise 17.16 Use Proposition 17.5 on p. 436 and prove that a nonempty smooth 
quadric over an infinite field cannot be covered by a finite number of hyperplanes. 


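Exercise 17.17 The Gramian of $q$ in a basis $c, e_1, e_2, \ldots, e_n$ of $W = k \oplus V$, where the $e_i$ form a basis in $V$, has block form

$\begin{pmatrix} f_0 & f_1 \\ f_1^t & f_2 \end{pmatrix}$, where $f_0 \in k$, $f_1 \in V^*$, $f_2 \in S^2V^*$.

Since $\hat q : c \mapsto \lambda x_0$, where $\lambda \neq 0$, we conclude that $f_1 = 0$, $f_0 \neq 0$.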

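Exercise 17.20 Every point of $\operatorname{Sing} Q \cap H_\infty$ is clearly singular for $Q_\infty$. Conversely, let $v \in \operatorname{Sing} Q_\infty$ and $c \in \operatorname{Sing} Q \smallsetminus H_\infty$. Then $W = k \cdot c \oplus \operatorname{Ann} x_0$, and $v$ is orthogonal to both $\operatorname{Ann} x_0 = V$ and $c$. Hence, $v \in \operatorname{Sing} Q \cap H_\infty$.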

Exercise 18.1 The formal literal expressions checking associativity and distributivity identically coincide with those verifying the same rules in the field $\mathbb{C}$ defined as the residue ring $\mathbb{R}[x]/(x^2 + 1)$.

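Exercise 18.3 The equalities $F_\mathbb{C}w = \lambda w$ and $F_\mathbb{C}\overline{w} = \overline{\lambda}\,\overline{w}$ are conjugate, because $\overline{F_\mathbb{C}(w)} = F_\mathbb{C}(\overline{w})$ for all $w \in V_\mathbb{C}$.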

Exercise 18.6 See comments to Exercise 18.1. 


Exercise 19.3 Compare with Exercise 16.4 on p. 389. 

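Exercise 19.5 The form $h_Ww : W \to \mathbb{C}$ maps $w' \mapsto (w', w)$. The form $F^*h_Ww : U \to \mathbb{C}$ maps $u \mapsto (Fu, w)$. The form $h_UF^\dagger w : U \to \mathbb{C}$ maps $u \mapsto (u, F^\dagger w)$. Therefore, the coincidence $h_UF^\dagger = F^*h_W$ means the equality $(Fu, w) = (u, F^\dagger w)$ for all $u$, $w$.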

Exercise 19.6 The first assertion: $(zFu, w) = (Fu, \overline{z}w) = (u, F^\dagger\overline{z}w) = (u, \overline{z}F^\dagger w)$. The second: $(FGu, w) = (Gu, F^\dagger w) = (u, G^\dagger F^\dagger w)$.

Exercise 19.7 Since a unitary operator $F$ is invertible, $(Fu, w) = (Fu, FF^{-1}w) = (u, F^{-1}w)$ for all $u, w \in W$.

Exercise 19.8 Consider $\mathrm{Mat}_n(\mathbb{C})$ as a real vector space of dimension $2n^2$ with a basis formed by the matrices $E_{ij}$, $iE_{ij}$, where $E_{ij}$ has 1 in the $(i, j)$th position and 0 in all other places. Write $x_{ij}$, $y_{ij}$ for the coordinates in this basis. Then the matrix $(F_{ij}) = (x_{ij}) + i \cdot (y_{ij})$ is unitary if and only if the following quadratic equations$^{12}$ hold:

$\sum_\nu (x_{\nu i}^2 + y_{\nu i}^2) = 1$ for $1 \leq i \leq n$,

$\sum_\nu (x_{\nu i}x_{\nu j} + y_{\nu i}y_{\nu j}) = \sum_\nu (x_{\nu i}y_{\nu j} - y_{\nu i}x_{\nu j}) = 0$ for $1 \leq i < j \leq n$.

Hence, $U_n$ is a closed subset in $\mathrm{Mat}_n(\mathbb{C})$. Adding together all the equations written in the top row, we conclude that $U_n$ is contained within a ball of radius $\sqrt{n}$ centered at


$^{12}$They expand the matrix equation $\overline{F}^tF = E$.
the origin. Therefore, $U_n$ is compact. A diagonal matrix $D$ with diagonal elements of the form $e^{i\vartheta}$ is connected with the identity matrix $E$ by a smooth path $\gamma : [0, 1] \to U_n$ sending $t$ to the diagonal matrix with elements $e^{it\vartheta}$. Since every $F \in U_n$ can be written as $F = CDC^{-1}$ for some $C \in U_n$ and diagonal $D$ as above, the path $t \mapsto C \cdot \gamma(t) \cdot C^{-1}$ lies within $U_n$ and connects $E$ with $F$.


Exercise 19.9 The same arguments as in the proof of Proposition 19.3 on p. 492. The only difference is that now $\operatorname{Spec} F_\mathbb{C}$ splits into conjugate pairs of complex numbers having unit length, that is, equal to $\cos\vartheta \pm i\sin\vartheta$. The corresponding $v_1$, $v_2$ span the Euclidean plane $U$, on which $F|_U$ acts as rotation through the angle $\vartheta$.


Exercise 19.10 Arguments independent of Theorem 19.1 are sketched in Prob- 
lem 19.23 on p. 502. 


Exercise 19.13 Use the equality $(u, F^\dagger w) = (Fu, w)$.

Exercise 19.14 The proof of Theorem 19.2 works without any changes. 


Exercise 19.15 For every $u = \sum x_ie_i \in U$, $F(u) = \alpha_1x_1f_1 + \alpha_2x_2f_2 + \cdots + \alpha_rx_rf_r$, because $F(e_j) = 0$ for $j > r$.


Exercise 19.16 The proof of Theorem 19.3 works without any changes. 


Exercise 19.17 Write $u_1, u_2, \ldots, u_n$ and $w_1, w_2, \ldots, w_m$ for the orthonormal bases in $U$, $W$, where the matrix of the orthogonal projection $U \to W$ is as in Theorem 19.3, and let $\alpha_1 \geq \alpha_2 \geq \cdots \geq \alpha_n$ be singular values of the projection. Then $(u_i, w_i) = \alpha_i$, and all other $(u_i, w_j)$ are equal to zero. Thus, for every $u = \sum x_iu_i$, $w = \sum y_jw_j$, $(u, w) = \sum \alpha_ix_iy_i \leq \alpha_1\sum |x_iy_i| \leq \alpha_1 \cdot |x| \cdot |y|$ (the last inequality holds by formula (10.12) on p. 235). Therefore, $\cos\measuredangle(u, w) = (u, w)/(|x| \cdot |y|) \leq \alpha_1 = (u_1, w_1)$.

Exercise 19.18 The coincidence of singular values together with $\dim U' = \dim U''$, $\dim W' = \dim W''$ forces $\dim(U' + W') = \dim(U'' + W'')$ and $\dim(U' \cap W') = \dim(U'' \cap W'')$. Therefore, we can simultaneously identify $U' + W'$ with $U'' + W''$, $U' \cap W'$ with $U'' \cap W''$, and $W'$ with $W''$ by an isometry of the ambient space. It remains to remove the discrepancy of $U'$, $U''$ within the orthogonal complement to $U' \cap W' = U'' \cap W''$ taken in $U' + W' = U'' + W''$. Thus, we can assume that the ambient space is $W \oplus W^\perp$, where $W = W' = W''$, and the subspaces $U', U'' \subset W \oplus W^\perp$ are complementary to $W$. Such subspaces $U \subset W \oplus W^\perp$ are in bijection with the linear maps $F : W^\perp \to W$: the map $F$ matches its graph $U = \{(v, Fv) \mid v \in W^\perp\}$ and is recovered from $U$ as $F = \pi \circ \pi_\perp^{-1}$, where $\pi : U \to W$ is the projection along $W^\perp$, and $\pi_\perp : U \xrightarrow{\sim} W^\perp$ is the projection$^{13}$ along $W$. The subspaces $U', U'' \subset W \oplus W^\perp$ lie in the same orbit of the action $O(W) \times O(W^\perp) \subset O(W \oplus W^\perp)$ if and only if the corresponding operators $F', F'' : W^\perp \to W$ satisfy $F'' = SF'T$ for some $T \in O(W^\perp)$, $S \in O(W)$. By Corollary 19.5, the latter means that the singular values of $F'$ and $F''$ coincide. It remains to note that if we write the matrix of the orthogonal projection $\pi : U \to W$ and the matrix of the operator $F_U : W^\perp \to W$ in the same basis of $W$ and those bases in $U$ and $W^\perp$ that go to


$^{13}$It is one-to-one, since $W \cap U = 0$.
each other under the orthogonal projection $U \xrightarrow{\sim} W^\perp$, then we get exactly the same matrices.


Exercise 20.1 The nonzero elements of the Gram matrix are

$\det(E_{11}, E_{22}) = \det(E_{22}, E_{11}) = 1$ and $\det(E_{12}, E_{21}) = \det(E_{21}, E_{12}) = -1$.

Since for invertible matrices we have $\eta^\vee = \det(\eta)\,\eta^{-1}$, for such matrices we must have $(\eta\zeta)^\vee = \det(\eta\zeta)(\eta\zeta)^{-1} = \det\zeta\,\zeta^{-1}\eta^{-1}\det\eta = \zeta^\vee\eta^\vee$. Since the relation is linear in each matrix and invertible matrices linearly span $\mathrm{Mat}_2$, it holds for noninvertible matrices as well.
angle between
complex lines, 486 
Euclidean subspaces, 498 
quadratic form, 424
subspace, 424 
vector, 424 
anti-Hermitian matrix, 467 
anti-self-adjoint 


component of operator, 401, 488, 489 
antilinear map, 463
Appell polynomials, 89 
argument of complex number, 56 
associate elements, 111 
associated triangle, 275, 291 
associative algebra, 173 
associativity, 11, 19, 195, 279 
asymptotic 

directions, 265, 445 

quadric, 445 
automorphism, 3 

inner, 295 

of quadric, 440 

of set, 3 

outer, 295 
autopolar triangle, 456 
axiom 

of choice, 12 

of nontriviality, 20 
axis of paraboloid, 494 
singular, 390
binary
quadratic form, 424
reflexive, 8
symmetric, 8
transitive, 8 
binomial, 6 
coefficient, 6, 17, 85 
expansion, 85 
series, 84 
biorthogonal subspaces, 405 
birational bijection, 269, 
canonical
isomorphism V ≃ V**, 158
operator, 397
Cantor–Schröder–Bernstein
theorem, 135
Catalan numbers, 86
Cauchy–Riemann
differential equations, 461
relations, 460
Cauchy–Schwarz
Euclidean inequality, 235
Hermitian inequality, 483
parametrization, 501
theorem on representation of group, 294
Cayley–Hamilton identity, 222, 340, 366
center, 199
of Grassmannian algebra, 216
of group, 295, 330
of mass, 144
of pencil of lines, 266
centralizer, 295
chain in poset, 14
characteristic, 36
function, 130
polynomial
of bilinear form, 392
of endomorphism, 226
of matrix, 222
of operator, 366
of recurrence relation, 82
value, 392
chart affine, 253
Chebyshev polynomials, 251
Chinese remainder theorem, 34, 54, 120
chirality, 513, 514
closure, 265
cocube, 250
codimension, 138
coefficient
binomial, 85
leading, 44
lowest, 42
multinomial, 5
cofactor, 218
expansion, 218
cokernel, 170
combination
barycentric, 144
convex, 145
linear, 127, 130
combinatorial type of a subspace, 191
commensurable subgroup, 350
commutative
algebra, 173
diagram, 168
ring, 21
reduced, 28
triangle, 168
commutativity, 19
commutator
in algebra, 201
in group, 306
of matrices, 201, 385
subgroup, 306
complementary
subgroups, 328
subspaces
projective, 268
vector, 140
complete
coordinate flag, 191
flag, 518
poset, 15
quadrangle, 274
complex
conjugation, 58
of vectors, 463, 466
eigenvector, 464
number, 55
plane, 58
structure, 467
structures
conjugate, 478
complex-differentiable function, 461
complexification
of bilinear form, 465
of linear map, 463
of vector space, 462
component
nilpotent, 379
of bilinear form
skew-symmetric, 391
symmetric, 391
of operator
anti-self-adjoint, 401, 488, 489
self-adjoint, 401, 488, 489
semisimple, 379
composition, 279
factor, 324
length, 324
of maps, 10
series, 324
cone, 449, 453
congruence
modulo n, 27 
modulo ideal, 106 
modulo polynomial, 52 
conic, 435 
conjugate 
complex 
numbers, 58 
structures, 478 
vectors with respect to real structure, 
463, 466 
operators, 361 
points with respect to a quadric, 
443 
conjugation 
by group element, 295 
classes, 297 
constant term, 42 
content of polynomial, 116 
contraction, 161 
contraoriented bases, 234 
convex 
figure, 145 
hull, 145 
coordinate form, 156 
coordinates, 128 
barycentric, 154 
homogeneous, 254 
local affine, 255 
of vector in basis, 128 
cooriented bases, 234 
coprime 
elements 
in commutative ring, 26 
in PID, 111 
ideals, 120 
integers, 26 
polynomials, 48 
correlation 
left, 389 
right, 387 
skew-symmetric, 391 
symmetric, 391 
coset 
left, 300 
of ideal, 106 
right, 300 
countable set, 3 
covector, 155 
Cramer’s rule, 127, 223 
criterion 
Eisenstein’s, 119 
of Krull, 122 
cross product, 507 
cross ratio, 69, 272
on conic, 456 
cube, 248, 283 
group of, 292 
cycle, 281 
cyclic 
basis, 368 
group, 62, 281 
operator, 370 
permutation, 281 
subgroup, 280 
type 
of linear operator, 367 
of permutation, 282 
vector, 370 
cyclotomic polynomial, 61, 70 
cylinder, 450 


Darboux’s theorem, 410 
decomposable 

bilinear form, 405 

operator, 362 
decomposition 

Jordan, 378 

of homomorphism, 107, 149, 302 

polar, 495 

root, 376 

SVD, 498 
degenerate 

matrix, 210 

quadratic form, 423 
degree 

function in Euclidean domain, 109 

lowest, 42 

of monomial, 17 

of polynomial, 44 

total, 151 
dependent variables, 188 
derivative 

logarithmic, 83 

of a power series, 44 
Desargues’s theorem, 276 
determinant, 210 

of Gram, 233 

of Sylvester, 227 
diagonal 

action, 305 

main, 197 

matrix, 170 

secondary, 197 
diagonalizable operator, 372 
diagram 

Newton, 96 
Young, 6

Young’s, 17 
difference 

of sets, 1

operator, 90 

symmetric, 130 
differential 

of affine map, 148, 445 

of function C > C, 461 

operator, 394, 490 
differentiation rules, 45 
dihedron, 284 
dimension 

of affine space, 142 

of vector space, 133 
direct product 

of abelian groups, 30 

of commutative rings, 30 

of groups, 330 

of sets, 2 

of subgroups, 328 

of vector spaces, 141 
direct sum 

of submodules, 342 

of vector spaces, 141 

of vector subspaces, 140 
direction subspace, 145 
discriminant, 68 
disjoint union, 1
distance, 238 
distributivity, 20, 195 
divisibility, 24 
division, 23 

algebra, 507 

of polynomials, 46 

with remainder, 109 
dodecahedron, 283 

group of, 293 
domain 

Euclidean, 109 

integral, 28 

principal ideal, 109 

unique factorization, 112 
double 

line, 437 

point, 435 
dual 

bases, 158 

basis 

left, 395 
right, 395 

linear map, 164 

projective spaces, 276 

vector spaces, 155, 158 


duality 
polar, 443 
projective, 276 


echelon 
matrix, 183 
pyramid, 250 
eigenspace, 372 
eigenvalue, 371 
eigenvector, 170, 371 
complex, 464 
Eisenstein’s criterion, 119 
element 
algebraic, 54, 175 
associate, 111 
generating, 309, 310 
idempotent, 39 
identity, 23 
inverse, 19, 279 
invertible, 24, 174 
irreducible, 48, 71 
maximal in poset, 14 
minimal in poset, 14 
neutral, 19, 23 
noninvertible, 24 
of infinite order, 281 
of torsion, 341 
opposite, 19 
prime, 114 
reducible, 48, 71 
transcendental, 175 
unit, 19 
elementary divisors, 351, 
aoe 
of operator, 363 
theorem on, 352 
elements 
compatible, 14 
incompatible, 14 
ellipsoid, 447, 450 
imaginary, 447 
elliptic 
paraboloid, 449, 451 
quadric, 441 
empty set, 1
endomorphism, 3 
algebra, 174 
Frobenius, 37 
linear 
diagonalizable, 170 
normal, 252 
of set, 3 
epimorphism, 2 
equality
of maps, 2
of power series, 41
of sets, 1
equation
ax + by = k, 24
z^n = a in C, 61
of Markov, 419
equidistant, 241 
equivalence, 8 
of quadratic forms, 422 
of quadrics, 435, 444 
equivalence relation, 8 
Euclidean 
adjoint operator, 489 
algorithm 
for k[x], 48
for Z, 25 
bilinear form, 393 
distance, 238 
domain, 109 
dual basis, 236 
length of vector, 230 
space, 229 
structure, 229 
standard on R^n, 229


standard on integrable functions, 230 


valuation, 109 
volume, 234 
Euler’s 
bilinear form, 394 
four-square identity, 507 
function, 38, 39 
φ-function, 29
products, 68 
theorem 
on residues, 38 
on rotations, 247 
pentagonal, 101 
evaluation 
form, 158 
map, 4, 108, 121 
of polynomial, 47 
even permutation, 208 
exact sequence, 169 
exceptional basis, 395, 418 
exponential, 83 
extension of fields, 53 


factor 
group, 302 
ring, 107 
set, 8 


factorial, 4 
factorization theorem 

for integers, 27 

for PID, 115 

for polynomials, 48 
Fermat’s little theorem, 30 
fiber of map, 2 


Fibonacci numbers, 81, 226, 382 


field, 19 
C, 55


F_2, of two elements, 20


R, of real numbers, 21 
algebraically closed, 55 
extension, 53 
finite, 63, 64 
noncommutative, 507 
of fractions, 75 
of Laurent series, 76 
of rational functions, 76 

field of fractions, 75 

finite fields, 63 

finite length group, 324 


finite-dimensional vector space, 133 


finitely generated 
group, 310 


module, 127, 337, 344, 352 
finitely presented group, 311 


five lemma, 169 

flag, 191, 518 

forgetful map, 5 

form 

bilinear, 229, 387 

decomposable, 405 
degenerate, 390 
Euclidean, 393 
Euler, 394 
hyperbolic, 393, 425 
indecomposable, 405 
nondegenerate, 390 
nonsingular, 390 
nonsymmetric, 391 
regular, 392 
singular, 390 


skew-symmetric, 391, 408 
symmetric, 229, 391, 408 


symplectic, 393 

positive, 229 

quadratic, 421 
anisotropic, 424 
binary, 424 
degenerate, 423 
hyperbolic, 424, 425 
negative, 433 


Q, of rational numbers, 20 




nondegenerate, 423
nonsingular, 423
over F_p, 432
positive, 433
real, 433
singular, 423
sesquilinear, 469
formal power series, 41
formula
binomial, 85
of Burnside-Pólya-Redfield, 298
of Lagrange, 159
of orbit length, 296
of Taylor, 160
Viète’s, 67
fraction, 9, 73
fractional power series, 92
Fredholm alternative, 138
free
group, 309
module, 336
variable, 188
Frobenius endomorphism, 37
function, 2
characteristic, 130
complex-differentiable, 461
rational, 76
real-differentiable, 461
functional, 155


Gaussian
elimination, 182
over PID, 346
integers, 62, 115
lemma
on irreducible polynomials, 117
on quadratic residues, 40, 523
quadratic reciprocity law, 66
general linear group, 174
generating
elements, 310
series, 80
set, 310
generator, 309
of algebra, 109
of cyclic group, 62, 281
of group, 310
Gram
determinant, 233
of quadratic form, 423
matrix, 233, 389
of quadratic form, 422
Gram-Schmidt orthogonalization, 232, 482
Gramian, 233, 389
of quadratic form, 422
greatest common divisor, 24, 110
in an arbitrary ring, 27
in principal ideal domain, 110
in unique factorization domain, 115
of polynomials, 48
group, 279
abelian, 13, 21
additive, of ring, 22
affine, 148
commutative, 13
cyclic, 13, 62, 281
dihedral, 284
finitely
generated, 310
presented, 311
free, 309
Klein, 284
linear
affine, 148
general, 174, 179
orthogonal, 244
projective, 271
special, 214
symplectic, 412
multiplicative, of field, 22
of cube, 292
of dodecahedron, 287, 293
of figure
complete, 283
proper, 283
of finite length, 324
of homotheties, 290
of inner automorphisms, 295
of invertible elements, 28
of invertible residue classes, 28
of isometries, 396
special, 396
of Mathieu, 307
of relations, 311
of roots of unity, 60
of tetrahedron, 286
of transformations, 12
of triangle, 285
of units, 28
orthogonal, 244
of quadratic form, 427
proper, 244
special, 244
p-group, 330
Q_8, 304
simple, 324
special projective, 290
symmetric, 13
symplectic, 412 
unitary, 484 
special, 484 
group action, 294 
adjoint, 295 
diagonal, 305 
exact, 294 
faithful, 294 
free, 294 
m-transitive, 294 
regular, 294 
left regular, 294 
right regular, 294 
transitive, 294 


harmonic (pairs of) points, 274, 443 
harmonicity, 274 
Hermite polynomials, 251 
Hermitian 
adjoint 
linear map, 487 
matrix, 466 
operator, 488 
dual basis, 485 
inner product, 481 
isometry, 483 
length of vector, 481 
matrix, 467, 482 
norm, 481 
space, 481 
structure, 470 
standard on C^n, 470
standard on space of functions, 470 
vector space, 470 
volume, 484 
Hilbert’s basis theorem, 104 
homogeneous 
coordinates, 254 
polynomial, 260 
homomorphism 
of abelian groups, 31 
of algebras, 173 
of fields, 34 
of groups, 279 
of rings, 33 
of spaces 
with bilinear forms, 388 
with operators, 361 
with quadratic forms, 422 
of K-modules, 336 
trivial, 33 
homothety, 154
Hopf bundle, 516
Horner’s method, 47 
hyperbola, 264 
hyperbolic 

basis, 393, 425 

bilinear form, 393, 425 

paraboloid, 449, 451 

quadratic form, 425 

quadric, 441 

rotation, 428 
hyperboloid, 448 

of one sheet, 451 

of two sheets, 448, 450 
hyperplane, 138, 241 

at infinity, 253 

polar, 443 
hypersurface 

affine, 262 

projective, 263 


icosahedron, 283 
ideal, 103 
maximal, 107 
prime, 108 
principal, 103 
trivial, 103 
idempotent, 39, 374 
trivial, 39 
identity map, 3 
image 
of a point, 2 
of group homomorphism, 289 
of map, 2 
of ring homomorphism, 33 
imaginary 
ellipsoid, 447 
part, 56 
quaternion, 506 
vector, 463 
incompatible elements, 14 
indecomposable 
bilinear form, 405 
operator, 362 
index 
of inertia 
negative, 433 
positive, 433 
of prime p in &(F), 365 
of quadratic form, 433 
of subgroup, 300 
inertia 
index 
negative, 433
positive, 433
moment, 144 
infinity, 253 
injection, 2 
inner automorphism, 295 
inner product 
Euclidean, 229 
Hermitian, 481 
integral domain, 28 
interpolating polynomial, 381 
intersection of sets, 1
invariant factors, 344, 345
theorem, 344 
inversion in permutation, 209 
inversion number, 209, 321 
invertible 
element of algebra, 174 
power series, 43 
involution, 58, 153, 304, 373 
projective, 276 
involutive permutation, 303 
irreducible 
element of commutative ring, 48, 71 
factorization, 112, 113 
factors, 113 
isometry 
for bilinear forms, 388 
for quadratic forms, 422, 427
group, 396 
Hermitian, 483 
of hyperbolic plane, 427 
isomorphism, 2 
of affine quadrics, 444 
of bilinear forms, 388 
of operators, 361 
of projective quadrics, 435 
of quadratic forms, 422 


of sets, 2 

isotropic 
subspace, 395, 425 
vector, 424 

jet, 380 

join, 437 

Jordan 
basis, 368, 376 
block, 375 


nilpotent, 368 
chain, 368, 405 
decomposition, 378 
normal form, 376 


Jordan-Hölder
factor, 324 
series, 324 


Kähler triples, 471
with given w, 473 
with given g, 472 
kernel 
of bilinear form, 409 
left, 391 
right, 391 
of group homomorphism, 32, 289 
of linear map, 125 
of ring homomorphism, 33 
Klein group, 284 
Koszul sign rule, 216 
Kronecker’s algorithm, 119 
Krull criterion, 122 


Lagrange’s 
interpolating polynomial, 50 
interpolation formula, 159 
theorem 
on index of subgroup, 300 
on quadratic forms, 409 
Lagrangian subspace, 413 
Laguerre polynomials, 251 
Laplace 
operator, 499 
relations, 218 
Laurent series, 76 
law of inertia, 434 
leading 
coefficient, 44 
term, 44 
least common multiple, 25 
left 
adjoint operator, 399 
correlation, 389 
coset, 300 
dual basis, 395 
inverse map, 11
kernel, 391 
orthogonal, 403 
regular action, 294 
Legendre polynomials, 251 
Legendre—Jacobi symbol, 65, 71 
Leibniz rule, 46, 201 
lemma 
Gauss’s 
on irreducible polynomials, 117 
on quadratic residues, 523 




Witt’s, 430 
Zassenhaus’s, 325 
Zorn’s, 16 
length 
of group, 324 
of permutation, 321 
of vector 
Euclidean, 230 
Hermitian, 481 
line 
affine, 146 
projective, 256, 266, 271, 437 
linear 
combination, 127, 130 
dependence, 131 
endomorphism 
cyclic, 370 
diagonalizable, 372 
nilpotent, 170, 367 
normal, 252 
semisimple, 368 
form, 155 
fractional transformation, 271 
functional, 155 
group 
general, 174 
projective, 271 
special, 214 
involution, 153 
join, 437 
map, 102, 125, 135, 149, 151 
adjoint, 252 
dual, 164 
isometric, 388 
projective transformation, 270 
projector, 153 
relations, 131, 150, 337, 338, 356
span, 139 
system of hypersurfaces, 266 
local affine coordinates, 255 
localization, 73 
logarithm, 83 
logarithmic derivative, 83 
lowest 
coefficient, 42 
degree, 42 
term, 42 


Möbius
function, 39, 203 
inversion formula, 39, 203 
transform, 203 


map, 2 


affine, 148 
bijective, 2 
C-antilinear, 463 
forgetful, 5 
identity, 3 
increasing, 14 
injective, 2 
inverse, 12 
left, 11 
right, 11 
two-sided, 12 
K-linear, 336 
linear, 88, 102, 125, 135, 149, 151 
adjoint, 252, 488, 489 
dual, 164 
isometric, 244 
orthogonal, 244 
multilinear, 261 
nondecreasing, 14 
order-preserving, 14 
orthogonal 
improper, 244 
proper, 244 
polar, 443 
semilinear, 463 
surjective, 2 


Markov’s 


conjecture, 420 
equation, 419 


mass grouping theorem, 145 
Mathieu group, 307 
matrix, 129 


adjunct, 220, 455 
algebra, 178 
anti-Hermitian, 467 
antisymmetric, 227 
degenerate, 210 
diagonal, 170 
Gramian, 233 
Hermitian, 467, 482 
nilpotent, 202 
nondegenerate, 210 
of linear operator, 136 
orthogonal, 245 

over a noncommutative ring, 196 
reduced echelon, 183 
shifting, 381 
skew-Hermitian, 467 
skew-symmetric, 252 
standard basis, 129 
symmetric, 252 
symplectic, 412 
traceless, 252


triangular, 197 
unipotent, 202 
unitary, 484 
unitriangular, 197 
upper triangular, 252 
maximal 
element, 14 
ideal, 107 
median, 153 
method 
Gauss’s, 182, 346 
Horner’s, 47 
Newton’s, 96 
metric space, 238 
middle perpendicular, 242 
minimal 
element, 14 
polynomial 
of algebraic element, 54 
of an algebraic element, 175
of linear operator, 365 
word, 322 
minor, 217 
complementary, 218 
module, 124, 335 
cyclic, 357 
decomposable, 343 
finitely generated, 127 
free, 336 
indecomposable, 343 
Noetherian, 356 
semisimple, 343 
torsion-free, 341 
unital, 124, 335 
modulus of complex number, 56 
moment, 144 
monic polynomial, 44 
monomorphism, 2 
multilinear map, 261 
multinomial coefficient, 5 
multiple root, 51 
multiplication, 19 
of vectors by scalars, 123 
multiplicative 
character, 39 
subset of ring, 73 


negative 
inertia index, 433 
quadratic form, 433 
neighborhood, 70 
net of hypersurfaces, 266 


Newton 
diagram, 96 
polygon, 96 
Newton’s 
binomial 
modulo p, 29 
theorem, 6, 85 
with negative exponent, 80 
method, 96 
nilpotent, 28 
component, 379 
linear endomorphism, 170 
matrix, 202 
operator, 367 
nilradical, 121 
Noetherian ring, 104 
nondegenerate 
matrix, 210 
quadratic form, 423 
nonresidue, 65 
nonsingular quadratic form, 423 
nonsymmetric bilinear form, 391 
norm 
Hermitian, 481 
in Euclidean domain, 109 
of algebraic number, 113 
quaternionic, 506 
normal 
linear endomorphism, 252 
operator, 417 
number 
complex, 55 
of partitions, 101 
numbers 
Bernoulli, 91 
Catalan’s, 86 
Fibonacci, 81, 382 


octahedron, 283 
octaplex, 517 
odd permutation, 208 
open set in C, 70 
operation 
n-ary 
algebraic, 42 
binary, 19 
n-ary, 42 
operator, 361 
adjoint, 400, 411 
Hermitian, 487 
left, 399 
right, 399 
anti-self-adjoint, 401, 411
canonical, 397


completely reducible, 368 


conjugate, 361 
cyclic, 370 
decomposable, 362 
diagonalizable, 372 
idempotent, 374 
indecomposable, 362 
involutive, 373 
irreducible, 362 
Laplacian, 499 
nilpotent, 367 
normal, 417, 491, 492 
reflexive, 400 
self-adjoint, 401, 411 
semisimple, 368 
Serre’s, 397 
simple, 362 
symplectic, 412 
unitary, 483 

operators, similar, 361 

orbit, 296 
length formula, 296 
map, 296 

order 
of element, 62 


in additive group, 356 


of group, 13, 280 
of group element, 281 


of invertible residue class, 38 


partial, 13 

total, 14 

well, 15 
origin, 143 
orthogonal, 236 


collection of vectors, 231 
complement, 236, 409, 485 


group, 244 


of quadratic form, 427 


special, 244 
left, 403 
map 
Euclidean, 244 
improper, 244 
proper, 244 
matrix, 245 
polynomials, 251 


projection, 237, 239, 409, 485 


right, 403 


orthogonalization procedure, 232, 482 


orthonormal 
basis, 393, 482 


collection of vectors, 231 


outer automorphism, 295 


pairing, 160 
perfect, 160 
Pappus’s theorem, 276 
paraboloid, 448 
elliptic, 449, 451 
hyperbolic, 449, 451 
parallel displacement, 142 
parity of permutation, 209 
partial 
fraction expansion, 77 
order, 13 
partition, 6 
number, 101 
pencil 
of correlations, 391 
of hypersurfaces, 266 
perfect 
pairing, 160 
square, 115, 268, 437 
permutation 
cyclic, 281 
even, 208 
involutive, 303 
odd, 208 
shuffle, 210 
perpendicular 
hyperplane, 241 
middle, 242 
vectors, 230 
perspective triangles, 276 
Pfaffian, 414 
p-group, 330 
planarity, 440 
plane 
affine, 146 
projective, 437 
Platonic solid, 283 
Plücker’s
quadric, 457 
relations, 227 
point 
double, 435 
singular, 435 
smooth, 435 
polar, 443 
decomposition, 495 
duality, 443 
hyperplane, 443 
map, 443 
polarity, 443 


polarization of quadratic form, 422 


pole, 443 
polynomial, 43 
Appell, 89 




characteristic, 222, 366 
of bilinear form, 392 
of endomorphism, 226 
of recurrence relation, 82 
Chebyshev, 251 
constant, 44 
cyclotomic, 61, 70 
harmonic, 499 
Hermite, 251 
homogeneous, 260 
integer-valued, 359 
interpolating, 50, 381 
Laguerre, 251 
Legendre, 251 
minimal, 54, 175, 365 
monic, 44 
on vector space, 260 
reciprocal, 417 
reduced, 44 
separable, 51 
symmetric, 151 
polynomials 
coprime, 48 
orthogonal, 251 
poset, 14 
locally finite, 202 
totally ordered, 14 
well ordered, 15 
positive 
inertia index, 433 
quadratic form, 433 
power 
function, 84 
series, 41 
binomial, 84 
generating, 80 
preimage, 2 
presentation of group, 311 
prime 
element, 114 
ideal, 108 
integer, 27 
subfield, 36 
primitive 
residue (mod n), 38
root
(mod n), 38
of unity, 60 
principal 
axis of paraboloid, 494 
ideal, 103 
domain, 109 
principle 
Dirichlet’s, 3 


of transfinite induction, 15 
product 
cross product, 507 
direct 
of abelian groups, 30 
of commutative rings, 30 
of groups, 330 
of sets, 2 
of subgroups, 328 
direct, of vector spaces, 141 
inner 
Euclidean, 229 
Hermitian, 481 
of ideals, 120 
of matrices, 176 
semidirect 
of groups, 330 
of subgroups, 328 
projection 
in projective space, 269 
of conic onto line, 269, 438 
orthogonal, 239, 409, 485 
projective 
algebraic hypersurface, 263 
algebraic variety, 263 
closure, 265, 444 
duality, 276 


enhancement of affine quadric, 444 


equivalence of quadrics, 435 

line, 256 

quadric, 435 

root, 267 

space, 253 

subspace, 263 
projectivization, 253 
projector, 153, 374 
proper 

submodule, 335 

subset, 1
Puiseux series, 92 
pullback of linear forms, 165 
pure imaginary 

quaternion, 506 

vector, 463 
Pythagorean theorem, 231 


quadrangle, 291 
quadratic 
form, 421 
anisotropic, 424 
binary, 424 
degenerate, 423 
hyperbolic, 424, 425
negative, 433
nondegenerate, 423
nonsingular, 423
over F_p, 432
positive, 433 
real, 433 
singular, 423 
nonresidue, 65 
reciprocity, 66, 71, 225 
residue, 65 
surface, 435 
quadric 
affine, 444 
cone, 449 
cylinder, 450 
paraboloid, 448 
equivalent 
affinely, 444 
projectively, 435 
Plücker’s, 457
projective, 435 
real 
elliptic, 441 
hyperbolic, 441 
projective, 441 
Segre’s, 439, 456 
quadruple of points 
harmonic, 274 
special, 273 
quaternion, 505 
pure imaginary, 506 
real, 506 
quotient 
by group action, 296 
group, 302 
homomorphism, 107, 302 
map, 8 
of division, 110 
by polynomial, 47 
ring, 107 
set, 8 
space, 149 


radial vector, 143 
radical 
function, 86 
of ideal, 120 
rank 
of bilinear form, 391 
of free module, 343 
of matrix, 166 
over principal ideal domain, 359 
of quadratic form, 423 


rational normal curve, 267 
real 
number, 21 
part, 56 
quadratic form, 433 
quaternion, 506 
structure, 466 
vector, 463 
real-differentiable function, 461 
realification, 459 
reciprocal 
bases, 344 
polynomial, 417 
reduced 
echelon form, 183 
polynomial, 44 
reducible element, 48, 71 
reduction 
modulo n, 9 
of coefficients, 118 
reflection, 246, 428 
reflexive 
binary relation, 8 
operator, 400 
reflexivity, 8 
regular 
bilinear form, 392 
polyhedron, 518 
relating set, 311 
relation 
binary 
equivalence, 8 
reflexive, 8 
skew-symmetric, 13 
symmetric, 8 
transitive, 8 
group, 311 
linear, 131, 337 
module, 337 
relations, 109 
of Cauchy—Riemann, 460 
of Laplace, 218 
of Plücker, 227
of Riemann, 477 
relator, 311 
remainder 
in Euclidean domain, 110 
of division by polynomial, 47 
representation of group, 294 
residue class 
invertible, 28 
modulo n, 27 
primitive, 38 
quadratic, 65
modulo an ideal, 106
modulo polynomial, 52 
resultant, 228 
Riemann relations, 477 
right 
adjoint operator, 399 
coset, 300 
dual basis, 395 
inverse map, 11 
kernel, 391 
orthogonal, 403 
regular action, 294 
ring, 195 
commutative, 21 
Euclidean, 109 
reduced, 28 
with unit, 21 
Noetherian, 104 
of fractions, 73 
of Gaussian integers, 62 
unique factorization domain, 112 
with unit, 195 
root 
adjunction, 53 
decomposition, 376 
of polynomial, 50 
m-tuple, 51 
multiple, 51 
simple, 51 
of unity, 60 
primitive, 60 
primitive modulo n, 38 
projective, 267 
subspace, 376 
rotary dilation, 57 
rotation 
Euclidean, 245, 247 
hyperbolic, 428 


Schur’s theorem, 500 
section of surjective map, 11 
segment, 145 
in a poset, 202 
Segre’s 
embedding, 439 
quadric, 439, 456 
selection axiom, 12 
self-adjoint 
component of operator, 401, 488, 489
operator, 401, 411 
Euclidean, 489 
Hermitian, 488
semiaxes of Euclidean quadric, 493
semidirect product 

of groups, 330 

of subgroups, 328 
semilinear map, 463 
semiorthonormal basis, 395 
semisimple 

component, 379 

operator, 368 
series 

antiderivative, 82 

fractional power, 92 

generating, 80 

Puiseux, 92 
Serre’s operator, 397 
sesquilinear form, 469 
set, 1

countable, 3 

empty, 1

generating, 310 

multiplicative, 73 

open in C, 70 

partially ordered, 14 

relating, 311 

uncountable, 3
shape of echelon matrix, 183 
shift 

operator, 89, 394 

transformation, 142 
shifting matrix, 381 
shortest decomposition in transpositions, 322

shuffle permutation, 210 
Siegel upper half-space, 477 
sign of permutation, 209 
signature of quadratic form, 433 
similar operators, 361 
simple 

cone, 449 

group, 324 

ratio, 69 

root, 51 
simplex, 249, 318 
simplified fraction, 76 
singular 

locus, 435 

point, 435 

quadratic form, 423 

values, 498 

decomposition, 497 

skew 

Hermitian matrix, 467 

symmetric 

bilinear form, 391, 408
component of bilinear form,
ol 
correlation, 391
Hermitian, 470
standard
theorem, 331
Sylvester’s
law of inertia, 434
symmetric
algebra, 259
difference, 130
symplectic
operator, 412